Basics Steps to Digital Management of Data --- Cultural ...


Bringing Digital Data Management Training into Methods Courses for Anthropology

Cultural Anthropology: Principles and Practices of Digital Data Management
Kathryn Oths
2016

Recommended citation: Oths, Kathryn. "Cultural Anthropology: Principles and Practices of Digital Data Management." In Bringing Digital Data Management Training into Methods Courses for Anthropology, edited by Blenda Femenías. Arlington, VA: American Anthropological Association, 2016.

© American Anthropological Association 2016. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Bringing Digital Data Management Training into Methods Courses for Anthropology is a set of five modules:
• General Principles and Practices of Digital Data Management
• Archaeology: Principles and Practices of Digital Data Management
• Biological Anthropology: Principles and Practices of Digital Data Management
• Cultural Anthropology: Principles and Practices of Digital Data Management
• Linguistic Anthropology: Principles and Practices of Digital Data Management

Project support: National Science Foundation, Workshop Grant 1529315; Jeffrey Mantz, Director, Cultural Anthropology Program

Organization
I. Review of material from General principles and practices module
II. Why is it important to preserve and protect your data?
III. Data types in cultural anthropology
IV. Managing data
V. Software

VI. Data archiving
VII. Exercises
VIII. References
IX. Acknowledgments

Review of material from General principles and practices module
• What are data?
• What is data management?
• What are the advantages of making data accessible?
• What are ethical dimensions of data management?
• What is a data management plan?

Why is it important to preserve and protect your data?
• Data collected in cultural anthropology research represent the cultural expressions and diversity of a people.
• Because cultures are always changing, the anthropologist captures a unique moment in time.
• Collecting data is a tremendous privilege.
• Anthropologists have an ethical obligation to protect the data they collect.
• Without advance planning to preserve and protect them, valuable data may be lost. In one sad case, 40 years of ethnographic research notes ended up in a dumpster upon the anthropologist's death because no provision had been made to archive them.
Photograph by Christine O. Masson and Tracy Jaeger. Used with permission.

Ethical dimensions of cultural anthropology data collection and management
• Data collected today may be all that is available to future anthropologists.
• Data preservation and protecting the confidentiality of respondents are equally important.
• Anthropologists must negotiate shared access to the products of their research.
• Research participants must be informed about data archiving and access, and about participant identification.
• Areas of interaction with the Institutional Review Board (IRB) and the community of study at the earliest stages of research design:
  • Informed consent for archiving and sharing of data
  • Negotiation of access and availability of data with research subjects and/or community
  • Issues of intellectual property

Data types in cultural anthropology
What is the nature of data?
• Office of Management and Budget definition of research data: "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (https://, Subpart B .36(d)(2)(i)).
• In general, this means visual, textual, and numerical data.
What is metadata?
• Metadata is "structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource" (NISO 2004).
• In short, it is data about data.

• Metadata is necessary for a third party to make sense of your body of work: the overall organizational structure of, and relationships among, the various data files, data forms, and data sets you have created.

Data types in cultural anthropology
• Field notes
• Interview transcripts
• Audio recordings (tapes, discs)
• Visual recordings (films, photographs, videotapes)
• Letters
• PDFs, other document forms
• Drawings
Margaret Mead and Gregory Bateson working in the mosquito room, Tambunam, 1938. Photograph by Gregory Bateson.

Data types: Born digital compared to made digital
• Made digital: Data that were collected in other than digital formats.
  • Made digital data require labor-intensive conversion into digital formats.
  • Specialists at the National Anthropological Archives devote considerable time to converting handwritten data of earlier scholars to PDF and creating metadata forms.
  • Photographs are converted using standard formats; see IPTC Metadata Standards.
• Born digital: With the new technologies available, research can be designed as digital from the very beginning of the process.
• Get in the habit of backing up: No matter what information you collect, be sure there is a secure second copy somewhere.
• No fear: Sharing data may seem like a new concept, but it was common to include data appendices with publications in the early twentieth century.

Managing data
Data management provides necessary ways to make data:
• Interpretable
• Systematic
• Codified
• Portable
• Durable
• Retrievable
• Perpetual
• Sharable
"But I don't do data!"
This is a common misconception among cultural anthropologists. If you gather any information while doing fieldwork, such as fieldnotes or photographs, then you have data.
http:// s/open-laptop-computer.html

Managing data: Case study of re-use
• Clarence Gravlee re-used data to revisit earlier, hotly contested findings of Franz Boas (1912) and Marvin Harris (1970) about race.
  • Boas measured the cephalic index of immigrant groups to dispel notions of racial heredity.
  • Harris demonstrated the ambiguity and cultural construction of Brazilian racial classifications.
• While the original works were innovative and carefully done, the analytic methods needed to answer the questions fully did not yet exist. One method Gravlee used was analysis of variance.

Managing data: Case study of re-use
Detail of a page of Boas's data in Materials for the Study of Inheritance in Man. In Gravlee et al. 2003. Used with permission of the American Anthropological Association.
• Gravlee and his co-authors (2003, 2005), using modern statistical methods, both substantiated and refined the original findings.
• Boas's reanalyzed data were in raw form; Harris's original stimuli were replicated.
• Both are digitized and available online, and have been used by other researchers.

Managing data: Basic steps
• Think about ways to make data legible and meaningful to others beyond yourself and/or your research group.

• Anonymize the data:
  • Two sets of notes may be required: one with sensitive data and one anonymized.
  • Implement a system from the start to anonymize data. Use a separate key that links names with ID numbers.
• Include contextual information:
  • Where, when, who, and how data were recorded
  • Demographics, e.g., neighborhood, gender, age
• Provide links between qualitative and quantitative data:
  • The same basic sampling parameters can often be used in both qualitative and quantitative data sets.

Software: Text analysis
Functions of text analysis software
• Aids in the interpretation and management of large amounts of textual, graphical, audio, or video data
• Aids in identifying and using data
  • Any passage, from a word to a full section, can be tagged with a code.
  • Coding facilitates accurate and easy retrieval and/or comparison to other passages.
[In-class exercise: Discussion of data collection and analysis]

Software: Text analysis
Common packages used by anthropologists:
• Open source: AnSWR (CDC), AQUAD 7, CATMA, ELAN, EZText (CDC), QCAmap, QDA Miner Lite
• Proprietary (all have a graphical user interface): ATLAS.ti, Dedoose, DiscoverText, Ethnograph, HyperRESEARCH, MAXQDA, NVivo, QDA Miner, Quirkos
Data files in all programs are exportable to portable file types such as .txt and XML to ensure cross-platform readability.

Numerical data
Types of data that can be input:
• Field notes (content-analyzed)
• Interview transcripts (content-analyzed)
• Case studies (content-analyzed)
• Survey data
• Anthropometric data, e.g., height, weight, body fat
• Biomarker measurements, e.g., salivary cortisol, blood pressure
It is important to learn how to code, enter, and clean numerical (and some text) data using a standard statistical package designed for the social sciences.
Content analysis: A research technique for making replicable and valid inferences by interpreting and coding textual material. This allows qualitative data to be converted to quantitative form.

Software: Numerical data analysis
Common packages used by anthropologists:
• Open source:
  • PSPP: graphical user interface
  • R: syntax-driven
• Proprietary: Anthropac, SPSS, SAS, STATA, SYSTAT, UCINET
While proprietary statistical packages may have more advanced features, open source packages cost nothing and work well.
Data files in all programs are exportable to portable, Excel-friendly formats such as .rtf or .txt to ensure cross-platform readability.

Steps to creating digital data: The codebook
• One way to store data in digital form is by numerical codes.
• A code is a symbolic (typically numerical) representation of a bit of meaningful information.

• Note carefully that coding does not diminish, destroy, or dehumanize the original data set.
• Coding simply stores the data in an alternate form that is accessible and, ultimately, manipulable.

Steps to creating digital data: The codebook
• A codebook is a guide to the chosen codes that
  • is created after they are assigned.
  • allows you to share these newly assigned meanings with others.
• A codebook is an efficient tool for organizing the information needed to properly record the data codes
  • for your immediate use.
  • to make data intelligible to others who may wish to share them, now and in the future.
[Outside-class exercise: Creating a codebook]
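As a minimal sketch of the idea above, numerical coding can be treated as a reversible lookup. The category labels, the code values, and the names GRAVITY_CODES, encode, and decode are hypothetical, chosen only for illustration; they do not come from the module.

```python
# Hypothetical codebook entry for one ordinal variable.
# Labels and code values are illustrative assumptions.
GRAVITY_CODES = {"mild": 1, "moderate": 2, "severe": 3}

# Reverse lookup: from numeric code back to the original label.
GRAVITY_LABELS = {code: label for label, code in GRAVITY_CODES.items()}


def encode(labels, codes):
    """Replace each category label with its numeric code."""
    return [codes[label] for label in labels]


def decode(values, labels):
    """Translate numeric codes back into category labels."""
    return [labels[value] for value in values]


responses = ["moderate", "mild", "severe", "mild"]
coded = encode(responses, GRAVITY_CODES)
print(coded)  # [2, 1, 3, 1]

# Decoding recovers the original answers, so nothing was lost by coding.
print(decode(coded, GRAVITY_LABELS) == responses)  # True
```

Because the codebook maps every code back to its label, the coded column can always be translated back. That is the sense in which coding stores the data in an alternate form without diminishing the original.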

Data archiving
What to do with data once they are manageable?
Types of archives:
• Private: Secure multi-media backup in a protected environment, especially for the original raw data set
• Public: Digital archiving
  • Digital archive within a U.S. university, such as http://
  • A national archive
  • Registry of Anthropological Data Wiki http://

Data archiving
Why archive?
• Private: For security
• Public: Sharing of data to
  • Enhance open scientific inquiry
  • Promote new research
  • Encourage diversity of approaches to data analysis
  • Allow others to test new or alternative hypotheses
Archiving helps move us from lone-wolf researchers to a community of social scientists.

In-class exercise: Discussion of data collection
1. What types of data have you generated in the field? At your office or home base?
2. For each situation: How did/do you protect your data? What methods of back-up did/do you use?
3. Are there any types of data for which you currently do not have a back-up plan?
4. With all candor, describe a time when you lost data due to insufficient protection.
5. Think about your most recent data collection instrument. What identifying information do you have on it? If you drop some field notes walking back from the library, will someone be able to return them to you?
6. Picture your most recent data files. What identifying information exists on them? If someone finds your data files 100 years later, will that person know what they are and how to interpret them?
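Question 6 turns on metadata. One low-tech way to make a data file self-describing is to keep a small plain-text record next to it. The sketch below is only an illustration: the function describe_file, the field names, and every sample value are hypothetical, and the four required fields simply mirror the "where, when, who, and how" contextual information recommended in the basic steps of managing data.

```python
import json

# Contextual fields every record must carry (hypothetical field names).
REQUIRED_FIELDS = ("where", "when", "who", "how")


def describe_file(filename, **context):
    """Return a JSON metadata record describing one data file."""
    missing = [field for field in REQUIRED_FIELDS if not context.get(field)]
    if missing:
        raise ValueError(f"missing context for {filename}: {', '.join(missing)}")
    record = {"file": filename, **context}
    return json.dumps(record, indent=2, sort_keys=True)


# Placeholder values only; they do not describe a real project.
sidecar = describe_file(
    "interview_017.txt",
    where="(field site)",
    when="(date recorded)",
    who="(interviewer initials; respondent ID, not name)",
    how="(audio-recorded interview, transcribed)",
    project="(project title)",
)
print(sidecar)
```

Saving the returned text alongside the data file keeps the description with the data wherever the file travels, and the check on required fields makes it harder to forget the context while it is still fresh.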

Learn about and discuss one text analysis software package, e.g., through a brief tutorial of CATMA.

Outside-class exercise: Creating a codebook
• At the top of the codebook, be sure to include general information identifying the project.
• Variable names are in CAPS, and each is ideally 2-8 characters long.
• Missing data are coded as a series of 9s that exceeds the highest value possible. Example for Age: since someone could be 99 years old, the missing value for Age would be 999.
• The first variable is always CASEID, the unique identifying number that each case carries.

Outside-class exercise: Creating a codebook
Example to use as a template: [Figure: codebook template]
Note: The codebook can be created using PSPP; many tutorials are available online.

Outside-class exercise: Creating a codebook
Each bit of data is a variable (i.e., the information will vary in value across your cases) and will need to be defined. The 3 types of variable are:
• Nominal: the least complex; it names something using categories, with no logical order; e.g., Gender.
• Ordinal: a measurement using categories, in which the categories are logically ordered though not necessarily of equal value; e.g., Illness Gravity.
• Continuous (aka interval): the most complex; measurement on a continuous scale, with each interval of equal value; e.g., Age.
Text variables are not coded, and may be entered as is (string).
A format indicates in what form a variable is coded.
• F for numerical data: Fx.y, where F means numerical, x is the maximum number of digits the highest value can have, and y is the number of decimal places. If the highest age possible (in round years, no fractions) is 99, the format will be F2.0.
• A for text data: A#, where A stands for text, and # indicates the number of characters (including spaces) allotted to that variable. Allow for the longest possible answer, e.g., Teodolinda: A10.

Outside-class exercise: Creating a codebook
Andean Highlander Demographics and Recent Illness History: Using these sample data, create your own codebook for the variables Name, Age, Gender, and Gravity of Illness.
1. At 32 years of age, Teodolinda is still living at home, nearly despairing of finding a husband. Her mother is worried that her pena (sadness) is to the point she cannot function well, and would like her to see a curandero for healing.
2. Daniel is 19, single, and the best soccer player his community has ever produced. As long as he stays healthy, everyone thinks he has a shot at playing for the national team.
3. Fidelita and her husband Raul would prefer to remain in the highlands and tend their crops and sheep, despite the pleas of their kids to come live with them on the coast, where they promise to get her treatment for her occasional skin allergies.
4. Azucena, 60 and recently widowed, is accompanied by two of her young grandchildren while their parents work in the city. She has dizzy spells that the doctor has said are due to extremely high blood pressure, though she thinks they are caused by mal viento (evil wind).
5. Since Jorge's wife died last year, there is no one to help around the house. Despite his advanced age of 99, he must ride to the market town each Sunday to get supplies. The last time, he fell off his burro and hurt his back, and is now bedridden with no family to care for him.
6. Lucía had a daughter, Claudia, with her childhood sweetheart. Her parents disapproved of the union, so at 27 years of age she gave birth at her sister's house without proper care from a midwife, which has led to the herbalist's diagnosis of a bit of debilidad (debility, exhaustion).
7. Tomás, divorced from his first wife for several years, has just moved in with a woman who is also 33. She has 2 teenage children from a previous union. The family would have planted their spring potato crop last week if he hadn't been bedridden with a case of the flu.
8. When Eustacia, 47, saw her son slip off the cliff during a storm, she suffered a tremendous susto (fright illness) that did not go away even though he lived. Her husband was powerless to make her feel better and was worried her illness was so severe she might die from it.

Outside-class exercise: Creating a codebook
All done? Your codebook should look something like this: [Figure: completed codebook]
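The format and missing-value rules from this exercise can also be written as a few small functions. What follows is one possible sketch, not the module's answer key. The function names, the numeric gender codes (1 = female, 2 = male), and the dictionary layout are illustrative assumptions. Names, ages, and genders are read from the eight cases, and Gravity of Illness is left out because assigning it takes the interpretive judgment the exercise is meant to practice. The last lines show the separate name key suggested earlier for anonymizing data.

```python
# Cases as (name, age, gender); None marks information the text does not give.
CASES = [
    ("Teodolinda", 32, "female"),
    ("Daniel", 19, "male"),
    ("Fidelita", None, "female"),
    ("Azucena", 60, "female"),
    ("Jorge", 99, "male"),
    ("Lucía", 27, "female"),
    ("Tomás", 33, "male"),
    ("Eustacia", 47, "female"),
]
GENDER_CODES = {"female": 1, "male": 2}  # assumed codes, for illustration only
MAX_AGE = 99  # highest possible age, as assumed in the exercise


def numeric_format(highest, decimals=0):
    """Fx.y: x digits in the highest possible value, y decimal places."""
    return f"F{len(str(highest))}.{decimals}"


def text_format(values):
    """A#: room for the longest answer, spaces included."""
    return f"A{max(len(value) for value in values)}"


def missing_code(highest):
    """Shortest run of 9s that exceeds the highest possible value."""
    digits = len(str(highest))
    code = int("9" * digits)
    return code if code > highest else int("9" * (digits + 1))


top_gender = max(GENDER_CODES.values())
codebook = {
    "CASEID": {"type": "nominal", "format": numeric_format(len(CASES))},
    "NAME": {"type": "text", "format": text_format([c[0] for c in CASES])},
    "AGE": {"type": "continuous", "format": numeric_format(MAX_AGE),
            "missing": missing_code(MAX_AGE)},
    "GENDER": {"type": "nominal", "format": numeric_format(top_gender),
               "codes": GENDER_CODES, "missing": missing_code(top_gender)},
}

# One coded row per case; an unknown age receives the missing-value code.
rows = [
    {"CASEID": caseid, "NAME": name,
     "AGE": codebook["AGE"]["missing"] if age is None else age,
     "GENDER": GENDER_CODES[gender]}
    for caseid, (name, age, gender) in enumerate(CASES, start=1)
]

# Anonymizing: keep the name key apart from the rows that will be shared.
name_key = {row["CASEID"]: row["NAME"] for row in rows}
shared_rows = [{k: v for k, v in row.items() if k != "NAME"} for row in rows]

for variable, entry in codebook.items():
    print(variable, entry)
```

With a highest possible age of 99, this yields F2.0 and a missing value of 999 for AGE, and A10 for NAME, matching the examples given in the exercise.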

Optional: Take a brief tour of how to manage data in PSPP.

References
Boas, Franz. "Changes in the Bodily Forms of Descendants of Immigrants." American Anthropologist 14 (1912): 530-62.
Gravlee, Clarence C., H. Russell Bernard, and William R. Leonard. "Boas's Changes in Bodily Form: The Immigrant Study, Cranial Plasticity, and Boas's Physical Anthropology." American Anthropologist 105(2) (2003): 326-32.
Gravlee, Clarence C. "Ethnic Classification in Southeastern Puerto Rico: The Cultural Model of Color." Social Forces 83(3) (2005): 949-70.
Gravlee, Clarence C. - Research. Accessed July 20, 2016.
Harris, Marvin. "Referential Ambiguity in the Calculus of Brazilian Racial Identity." Southwestern Journal of Anthropology 26(1) (1970): 1-14.
Leopold, Robert. "The Second Life of Ethnographic Fieldnotes." Ateliers du LESC 32 (2008). DOI: 10.4000/ateliers.3132
National Information Standards Organization (NISO). Understanding Metadata. Bethesda: NISO Press, 2004.

Ruel, Erin, William Edward Wagner III, and Brian Joseph Gillespie. "Data Archiving." In The Practice of Survey Research: Theory and Applications, 305-12. London: SAGE Publications, 2015.
Silver, Christina, and Ann Lewins. Using Software in Qualitative Analysis: A Step-by-Step Guide. London: SAGE Publications, 2014.

Acknowledgments
Modules:
• Writers: Arienne M. Dwyer, Blenda Femenías, Lindsay Lloyd-Smith, Kathryn Oths, George H. Perry
• Editor: Blenda Femenías
Discussants:
• Workshop One, February 12, 2016: Andrew Asher, Candace Greene, Lori Jahnke, Jared Lyle, Stephanie Simms
• Workshop Two, May 13, 2016: Phillip Cash Cash, Jenny Cashman, Ricardo B. Contreras, Sara Gonzalez, Candace Greene, Christine Mallinson, Ricky Punzalan, Thurka Sangaramoorthy, Darlene Smucny, Natalie Underberg-Goode, Fatimah Williams Castro, Amber Wutich
American Anthropological Association:
• Executive Director, Edward Liebow
• Project Manager, Blenda Femenías
• Research Assistant, Brittany Mistretta
• Executive Assistant, Dexter Allen
• Professional Fellow, Daniel Ginsberg
• Web Services Administrator, Vernon Horn
• Director, Publishing, Janine Chiappa McKenna
