Labs and Canvas Posts

At the start of almost every class, you will confer with a group of 3-5 peers about your lab results or Canvas post. This serves two purposes: (1) you get a chance to see whether you are working through the labs in a similar or different way from your peers, and you get to “confer” on answers; (2) if you decide to form a project team, you will know the technical and analytical strengths of your peers, which can help you form a team with equal representation of different skill sets.

About the Labs: These enable you to get hands-on experience with the tools and technologies used in the digital humanities. They provide an opportunity to learn about these tools in a low-stakes environment, allowing you to experiment and explore without the pressure of producing a finished product. By working through the labs, you can better understand the concepts and technologies involved, and learn which tools might be useful in your future research.

You should submit a link or screenshot of what you worked on/built.

Labs are due before class. No exceptions (unless stated on the syllabus).

About Canvas Posts: These posts should be about 500 words. Prompts for the Canvas posts can be found below. The goal of these responses is not simply to demonstrate that you have carefully read and considered the readings with a critical eye (that is assumed) or to provide summaries. Rather, these responses will form the raw materials for our class discussions, and you should use them as an opportunity to share candid impressions, questions, and things that you find puzzling, interesting, or contradictory.

Canvas posts are due by 6 pm the evening before class. No exceptions (unless stated on the syllabus).


 

Canvas Post 0

Create a Canvas thread. This will be your digital repository for the semester. Please make the title of the thread your name. Write a very brief (less than 300 words) introduction. Include a sentence or two about what you think you’ll build this semester. Throughout the semester, all of your Canvas Posts and Labs will be submitted in your personal Canvas thread.

This is an example of your digital repository for the semester.


Canvas Post 1

From the list of projects, choose (a) one or two of the projects, and (b) one or two of the student-created projects.

For each of your chosen projects,

  1. Interact with the site and its tool(s) until you have a good feeling for the project’s scope, goals, and capabilities.

  2. Read any associated “About” information to get a sense of who the project’s creators are (and how many of them there are), what tools were used in the creation, what its data sources are, and any sources of funding.

A few questions you might want to consider:

  • Who is the intended audience? Scholarly researchers, students, the general public, a combination?

  • Would this project be a useful tool for research, for teaching, for both? How might it be used to generate new research questions -- or answers?

  • What scholarly decisions were made in the design of the underlying dataset, the metadata, the visualization, and/or the user interface?

  • If the underlying data is still being compiled, how is it being compiled? Are the creators using crowdsourcing or public engagement methods for collecting data?

  • Are the project’s creator(s) writing (or have they already written) published works derived from their digital project?

  • Do the creators offer their data or code for reuse by others?

  • In what ways do you find the project successful or unsuccessful?

  • What are your questions?

For your Canvas post:

Think about the data (or sources) that will underlie your own project. What issues do you foresee needing to address as you assemble, analyze, and present your data?

  • What are some factors that are important to the success of a digital project that you want to replicate in your own project?

  • What do you want to understand better?

  • What do you want to know how to do?

Feel free to refer to some of the projects you looked at in preparation for class.


Canvas Post 2

In what ways do the projects we examined in the prior class utilize no-code, low-code, or minimal computing?

What do you understand by these terms, and how might you implement them in a project?

What are the ethical and practical implications of low/no-code and minimal computing for the type of project you might imagine undertaking, if there are any?

How does the language of minimal computing, no-code, and low-code impact your understanding of the digital humanities?


Lab 1

Some questions to consider:

  • Does your data have metadata? Do you need to create metadata or additional metadata?

  • What metadata might be useful or necessary for you to perform the analysis and/or create the interface that you plan?

  • What do you wish you understood better?

Download OpenRefine [https://openrefine.org/], free, open-source software for cleaning data.

Data we will be wrangling:

  • https://en.wikipedia.org/wiki/Heinemann_African_Writers_Series

  • https://www.goodreads.com/list/show/73176.African_Writers_Series


Lab 2

Option 1: If you know the project you’ll be working on this semester

Begin to create a data set for a project you might pursue during this class. The corpus should include at least 25 “entries,” but it could go up to 10,000+ depending on what you’re thinking about. You can choose to create the dataset from scratch or combine different ones you found online that fit your chosen project.

Use OpenRefine to start organizing your data.

Please be prepared to discuss what you collected, the metadata you chose to collect, and possible use cases for the data. Also be ready to talk about your experiences sourcing, wrangling, and cleaning the data. Write a 250-500 word reflection addressing these points.

Option 2: If you don’t know what project you’ll be working on this semester

Download this data set. [LINK]

Directions: When you upload the data, make sure to select CSV (not TSV). This ensures that OpenRefine parses the data into a table rather than a single string of numbers and text.
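The CSV/TSV setting matters because it tells the parser which character separates columns; choose the wrong one and each row collapses into a single cell. The same distinction can be sketched with Python’s built-in csv module (the sample row here is illustrative, not the actual lab dataset):

```python
import csv
import io

# A tiny, hypothetical comma-separated file.
data = "date,author,title\n1958,Chinua Achebe,Things Fall Apart\n"

# Parsed as CSV: the comma splits each line into three columns.
as_csv = list(csv.reader(io.StringIO(data), delimiter=","))

# Parsed as TSV: no tabs are found, so each line stays one long string.
as_tsv = list(csv.reader(io.StringIO(data), delimiter="\t"))

print(len(as_csv[0]))  # 3 columns
print(len(as_tsv[0]))  # 1 "column"
```

This is exactly the symptom described above: with the wrong delimiter, the table never materializes.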

Try the following:

  • Remove duplicates.

  • Place data into different columns.

  • Name your columns.

  • Remove grammatical errors/misspellings.

  • Put the date first, followed by the author, and then any other data that you find to be important.
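If it helps to see what OpenRefine is doing behind its interface, the same cleaning steps can be expressed in a few lines of code. A sketch in plain Python with hypothetical sample rows (in the lab itself, OpenRefine does this work for you):

```python
# Hypothetical messy rows, as they might arrive from a scraped list.
rows = [
    {"title": "Things Fall Apart", "author": "Chinua Achebe", "date": "1958"},
    {"title": "Things Fall Apart", "author": "Chinua Achebe", "date": "1958"},  # duplicate
    {"title": "Weep Not, Child", "author": "Ngugi wa Thiong'o", "date": "1964"},
]

# 1. Remove exact duplicates while preserving order.
seen, deduped = set(), []
for row in rows:
    key = tuple(sorted(row.items()))
    if key not in seen:
        seen.add(key)
        deduped.append(row)

# 2. Reorder columns: date first, then author, then the rest.
ordered = [
    {"date": r["date"], "author": r["author"], "title": r["title"]}
    for r in deduped
]

print(len(ordered))      # 2 rows after deduplication
print(list(ordered[0]))  # ['date', 'author', 'title']
```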

Everyone:

1) Document how you uploaded the data (a link, spreadsheet, CSV, copy-paste, etc.)

2) Take a screenshot of what your data looks like pre-cleaning.

3) Take a screenshot of the cleaning in-progress (2-3 screenshots would suffice)

4) Once you’re done cleaning the data, download it as an Excel sheet.

5) Take a screenshot of the final document (everyone’s might look a bit different)

6) Please be prepared to discuss what you collected, the metadata you chose to include, and possible use cases for the data. Also be ready to talk about your experiences sourcing, wrangling, and cleaning the data. Write a 250-500 word reflection addressing these points.


Lab 3

This is my personal dataset of the Afro-Asian little magazine Lotus. Keeping in mind our earlier conversations about metadata, choose which metadata from the Lotus corpus you will map. Why have you chosen these terms? What question does this specific compilation of metadata aim to answer?

Examine the options before making a choice. I personally find Onodo to be more user-friendly; Gephi is more powerful and is the more frequently used network analysis tool.

Option 1:

Create the network model using Onodo https://onodo.org/. Play around with the tool and try to manipulate the data at least two different ways. Please document these two manipulation examples. What observations can you make from it? How do your observations change during the second manipulation?

Option 2:

Work through Martin Grandjean's Gephi tutorial ("Introduction to Network Visualization with GEPHI").

If you want to install Gephi on your own machine, you may also be interested in an article explaining the frequently used "ForceAtlas2" layout option for Gephi visualizations. The article is technical, but it gives a sense of what would be involved in unlocking the "black box" of concepts behind such algorithms: Mathieu Jacomy et al., "ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software" (2014).

Option 3:

http://hdlab.stanford.edu/palladio/

Lotus Data [Download]
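Whichever option you choose, the tool is doing the same underlying work: reading your metadata as a list of nodes and edges and computing measures over them. A minimal sketch in plain Python of the simplest such measure, node degree, using hypothetical contributor/issue pairs (placeholders, not the actual Lotus data):

```python
from collections import Counter

# Hypothetical edges: (contributor, issue) pairs drawn from magazine metadata.
edges = [
    ("Contributor A", "Lotus 1"),
    ("Contributor A", "Lotus 2"),
    ("Contributor B", "Lotus 1"),
    ("Contributor C", "Lotus 2"),
]

# Degree = how many connections each node has. Tools like Gephi and Onodo
# size or color nodes by measures like this one.
degree = Counter()
for source, target in edges:
    degree[source] += 1
    degree[target] += 1

print(degree["Contributor A"])  # 2
print(degree["Lotus 1"])        # 2
```

Seeing the data in this shape can also help you decide which metadata columns should become nodes and which should become edges before you load anything into a tool.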


Lab 4

Ideas of what to engage with:

  • Create a map of something that is not necessarily (or traditionally thought of as) mappable.

  • Create a map related to issues of sovereignty as discussed in the “Visualizing Sovereignty” article.

  • Create a map of a novel, an author’s works, or some other data.


  • Using one of the visualization tools or platforms listed at the bottom of this lab, create an interesting visualization or diagram of something (e.g., a text, a dataset, etc.).  Some multiple-purpose visualization tools include ManyEyes, Tableau Public, yEd.

Go to Storymap JS and review the website.

  • Click the green “Make a Storymap now” button. Follow the instructions to create points on the map from your Timeline elements.

  • When your map is complete (or even before it is finished), click on the Share button and scroll down.

  • Copy the embed code from the window and paste it into your Canvas discussion post. Your Storymap should now render in your Canvas discussion post below your timeline and link to your Google Docs spreadsheet.

  1. Using the StoryMap JS online tool from the Northwestern U. Knight Lab, show how you could tell a good story (or argument) based on a life, literary work, historical event, contemporary event, social phenomenon, or abstract/theoretical concept.  StoryMap creates flow maps (interactive maps that zoom from location to location with associated images/text called up at each point: example).  (When asked by StoryMap "What type of story do you want to create?", choose the "map" option, which allows you to use a ready-made zoomable map of the world.)  Your goal is to demonstrate how mapping can add value or a different perspective to textual narrative/argument.  Your map-story need only contain a few points with associated images and text--enough to mock up what you would do with more time.  Try to do something interesting that allows us to think about how mapping interacts with or differs from textual narrative, what it adds and what it takes away, etc. (Note: the StoryMap JS tool is free, but it requires that you have a Google account because it uses Google Drive.)

  2. An extra but not necessary step (which will be useful to those of you interested in historical maps): Using StoryMap JS, upload your own map, image, or photo to use as an interactive, zoomable visualization on which to tell a story or argument.  (When asked by StoryMap "What type of story do you want to create?", choose the "Gigapixel" option.)  This option requires a few more technical steps, but it opens up many more imaginative possibilities--e.g., the ability to tell a story/argument by moving around a fictional map, a historical map, a photograph of a landscape, a group portrait, a painting, etc.: example.

    If you choose to build your own map background (which you do not have to do), do the following:

    1. First, you need a map or image.  For example, you can get many thousands of resources from the David Rumsey Map Collection (historical maps) or the Folger Library Digital Image Collection (both use a platform called the Luna browser that has an "export" function producing a downloadable zip file of each image).

    2. Then you need to process the map or image into a "tiled" form by using Zoomify through Photoshop (how-to), the Zoomify program, or another means (instructions).  (See also a video tutorial.)  The Zoomify process creates a folder (with subfolders) of tiled or sectioned parts of your image.

    3. Then you need to move the folder containing the Zoomified, tiled version of your image to your Google Drive, share the folder publicly, and note the "hosting" base URL by which Google Drive can serve up the folder on the Web (instructions).

    4. Finally, in StoryMap create a new storymap by choosing the "Gigapixel" option and inputting the hosting base URL and also the size in pixels of your original image. (See video tutorial.)

    5. Once the storymap is created, you can add locations with images/text on a slide-by-slide basis.

OR

Use one of the following:

http://hdlab.stanford.edu/palladio/

Chartsbin http://chartsbin.com/about/apply

Clio https://theclio.com/

Google Earth (click the new project button on the sidebar) https://earth.google.com/web/@49.83703,24.0255735,282.13611875a,652.42796915d,35y,0h,45t,0r/data=ChIaEAoKL20vMGdnNF85ZBgCIAE

Google Earth Studio google.com/earth/studio/

Google Earth Engine https://earthengine.google.com/

Odyssey http://cartodb.github.io/odyssey.js/

QGIS https://www.qgis.org/en/site/index.html


Lab 5

Play with three or more of these tools (list designed by Alan Liu). If you need a dataset, you can use one of your own or one from the ToolBox. At least one of the three tools you use must be from TAPoR.

TAPoR (Text Analysis Portal for Research) (a collection of online text-analysis tools, ranging from the basic to the sophisticated)

  • TaPOR 2.0 (current, redesigned TAPoR portal; includes tool descriptions and reviews; also includes documentation of some historical or legacy tools)

Google Ngram Viewer (search for and visualize trends of words and phrases in the Google Books corpus; includes the ability to focus on parts of the corpus [e.g., "American English," "English Fiction"] and to use a variety of Boolean and other search operators); see the related article: Jean-Baptiste Michel, Erez Lieberman Aiden, et al., "Quantitative Analysis of Culture Using Millions of Digitized Books" (2011)

  • See also: Bookworm (interface for Google Ngram-style visualization of trends in a select number of corpora: Open Library books; arXiv science publications; Chronicling America historical newspapers; US Congress bills, amendments, and resolutions; Social Science Research Network research paper abstracts)

HathiTrust Research Center (HTRC) Portal (allows registered users to search the HathiTrust's ~3 million public domain works, create collections, upload worksets of data in CSV format, and perform algorithmic analysis -- e.g., word clouds, semantic analysis, topic modeling) (Sign up for a login to the HTRC portal; parts of the search and analysis platform requiring institutional membership also require a userid from the user's university)

  • Features Extracted From the HTRC ("A great deal of fruitful research can be performed using non-consumptive pre-extracted features. For this reason, HTRC has put together a select set of page-level features extracted from the HathiTrust's non-Google-digitized public domain volumes. The source texts for this set of feature files are primarily in English.  Features are notable or informative characteristics of the text. We have processed a number of useful features, including part-of-speech tagged token counts, header and footer identification, and various line-level information. This is all provided per-page.... The primary pre-calculated feature that we are providing is the token (unigram) count, on a per-page basis"; data is returned in JSON format)

Bookworm: https://bookworm.htrc.illinois.edu/develop/

UCREL API: http://ucrel-api.lancaster.ac.uk/claws/free.html

Juxta Commons ("a tool that allows you to compare and collate versions of the same textual work")

Poem Viewer ("web-based tool for visualizing poems in support of close reading")

Voyant Tools (online text reading and analysis environment with multiple capabilities that presents statistics, concordance views, visualizations, and other analytical perspectives on texts in a dashboard-like interface. Works with plain text, HTML, XML, PDF, RTF, and MS Word files (multiple files are best uploaded as a zip file). Also comes with two pre-loaded sets of texts to work on: Shakespeare's works and the Humanist List archives [click the "Open" button on the main page to see these sets])
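Under the hood, most of these tools begin with the same two operations: tokenizing a text and counting tokens or n-grams. A minimal sketch in Python (the sample line is from Yeats's "The Second Coming"; this is illustrative, not any tool's actual implementation):

```python
import re
from collections import Counter

text = "Turning and turning in the widening gyre / The falcon cannot hear the falconer"

# Tokenize: lowercase the text and keep only word characters.
tokens = re.findall(r"[a-z']+", text.lower())

# Word frequencies -- the starting point for Voyant-style summaries.
freq = Counter(tokens)

# Bigrams -- the n=2 case of the n-grams behind the Ngram Viewer.
bigrams = Counter(zip(tokens, tokens[1:]))

print(freq["the"])      # 3
print(freq["turning"])  # 2
```

Every tool above layers interfaces, corpora, and statistics on top of counts like these, which is worth keeping in mind when you evaluate what a dashboard is actually showing you.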


Canvas Post 3

Keeping in mind all the concepts we’ve engaged with this semester, write a post that succinctly traces the similarities, differences, purposes, weaknesses, and strengths of the two digital examples we’re examining today. Engage with the essays we also read for today to inform your response. 500-750 words


Canvas Post 4

Please follow any train of thought that is generative to you.

Here is a suggestion: What are the relationships between digital media, digital methods, and the digital humanities? What other terms might we use to describe the constellation of digital projects we’ve encountered thus far? What is the relationship between the materials we are looking at for today and those we’ve engaged with this semester? How does the inclusion of these texts complicate how we might approach the digital humanities as a field of inquiry?

500-750 words


Canvas Post 5

Describe the digital project you believe you’ll pursue.

Consider Your Project’s Data. What is the raw research material that you might like to use in a project?  In DH research, we call this material your “data” – even if you don’t think of it that way! Your data might be collections of images (photos, artworks, documentary images, etc.), text (emails, periodicals, social media, books, poems, plays, etc.), tabular data (geospatial locations, statistics, Census results, etc.), combinations of several of these types of data, or something else altogether. Does your data already exist? If so, is it in analog or digital format currently? If not, do you need to produce your data?

Consider Your Project’s Research Approach. Please read the excellent “Topics in DH” page of the online Digital Humanities Literacy Guidebook (Weingart, Grunewald, & Lincoln, 2020). This resource provides a good sense of the wide variety of tools and methods that can be included under the very broad umbrella of “digital humanities.” I am prepared to support projects under most of the research categories or at least help you find resources to support your work, but at this time I can’t help with 3D modeling, computational linguistics, corpus linguistics, critical code studies, digital forensics, stylometry, machine learning, and virtual reality.

Your Canvas post should be roughly 500-750 words.


Canvas Post 6

Adapted from the Cornell digital humanities lab

Omeka/Neatline

Omeka really shines in its ability to manage collections of archival objects (including photos, audio, and video), and to create exhibitions from them.

Neatline is a sophisticated plugin for Omeka that allows mapping. Check out some of the example projects made with Neatline, and compare with Carto. One difference between the two is that Omeka allows much more freeform annotation, and various ways of embedding collections of objects within the map. Neatline also offers timeline tools.

Omeka.net is the hosted version of Omeka. Anyone may create a small Omeka.net project for free (with limited plugins).

However, Omeka.net does not support Neatline, so if you want to use it, you'll have to host your own installation of Omeka (Omeka.org), using server space at Reclaim Hosting or elsewhere.

For Omeka usage instructions, see: The Programming Historian's Up and Running with Omeka.net

Note: Omeka can be a great pedagogical tool, in part because the interface is so simple to use.

Scalar

Also see: Scalar User Guide

WordPress

Also see: WordPress.com Features

Carto

Also see: Carto Builder Overview

As you explore the projects, please keep the following guiding questions in mind (and please write a summary of your thoughts on these questions):

  • What kind of projects would be a good fit for the features of this platform?

  • Can you identify the type(s) of data/metadata being used?

  • In how many ways is the data made available to others? (For instance, how might a researcher download the data to use in a new project?)

  • If you imagine this project in the coming years (decades?), what might be problems that come up over time? Are there areas of concern for sustainability?

  • Where would your project be hosted (and by whom?) if you used this platform? Who has ownership, control, and rights?

  • How "safe" is your data here? Does the platform allow you to protect your data in some fashion?

  • Is it possible to truly restrict access? Or is it security by obscurity?

500-750 words