FlowingData Forums » Statistics and Data

Digging Into Data Challenge

Started 3 years ago by nathany / 2 posts

  1. http://www.diggingintodata.org/ [via keyvowel]

    What is the "challenge" we speak of? The idea behind the Digging into Data Challenge is to answer the question "what do you do with a million books?" Or a million pages of newspaper? Or a million photographs of artwork? That is, how does the notion of scale affect humanities and social science research? Now that scholars have access to huge repositories of digitized data -- far more than they could read in a lifetime -- what does that mean for research?

  2. It means that research changes focus.

    The old focus was on discovering new facts. The new focus is on discovering patterns of facts.

    This is particularly notable in archaeology. If ever there was a science whose public image rests on the literal discovery of new facts--in this case, arti"facts" buried within historical sites--archaeology would have to be it.

    But in the past decade or so, archaeologists have begun to put all their data into databases and to data-mine them. In the American Southwest, for example, there are now databases that record the ages, locations, and many other parameters of thousands of dig sites and millions of objects from the American Indian cultures that previously lived there.

    And data mining is starting to provide answers to questions that were previously unapproachable. Who were these people? What were their geographical ranges? How did they trade goods with one another, often across vast distances, on foot? How did one society's art and technology affect its neighbors?

    By looking at patterns in the data, such as projectile points vs. knapping style vs. time, researchers can get a picture of where a particular technology originated, when it originated, and how it spread across the Southwest and Central America.
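    To make that concrete, here is a rough sketch of the kind of query I mean. Everything in it is made up for illustration: the site names, the point styles, the years, and the function names are my own assumptions, not anything from a real archaeological database. It just finds the earliest date each style shows up at each site, then sorts the sites by that date, so the first entry is a candidate origin and the rest suggest the order of spread.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration only:
# (site, projectile_point_style, approximate_year_CE)
RECORDS = [
    ("Site A", "corner-notched", 600),
    ("Site B", "corner-notched", 450),
    ("Site C", "corner-notched", 700),
    ("Site B", "side-notched", 900),
    ("Site A", "side-notched", 1000),
    ("Site A", "corner-notched", 650),  # later find at a site already seen
]


def first_appearances(records):
    """Map each style to {site: earliest year that style appears there}."""
    earliest = defaultdict(dict)
    for site, style, year in records:
        previous = earliest[style].get(site)
        if previous is None or year < previous:
            earliest[style][site] = year
    return earliest


def spread_order(records, style):
    """Return (site, year) pairs for one style, oldest first.

    The first pair is only a candidate origin: it is the earliest
    appearance in the data we happen to have, nothing more.
    """
    by_site = first_appearances(records).get(style, {})
    return sorted(by_site.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for site, year in spread_order(RECORDS, "corner-notched"):
        print(site, year)
```

    A real analysis would also need coordinates, dating uncertainty, and far more records, but the basic move is the same: the pattern comes from comparing sites, not from any single dig.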

    Yes, discovering a new dig site can be exciting and revolutionary. But in another sense, discovering yet another dig site is kind of ho-hum. After you've already got a few million objects catalogued, digging up another few hundred arrowheads or broken pots or whatever becomes kind of meaningless. It's only when you can put that new data into context with all the rest and look at the patterns that emerge that the data can take on any real meaning.

