Thursday, 31 March 2011

new manovich.

The Promises and the Challenges of Big Social Data

text author: Lev Manovich

article version: 1
posted March 31, 2011

[This is the first part of a longer article – the second part will be posted in the next few days]

The emergence of social media in the middle of the 2000s created a radically new opportunity to study social and cultural processes and dynamics. For the first time, we can follow the imagination, opinions, ideas, and feelings of hundreds of millions of people. We can see the images and the videos they create and comment on, eavesdrop on the conversations they are engaged in, read their blog posts and tweets, navigate their maps, listen to their tracklists, and follow their trajectories in physical space.

In the 20th century, the study of the social and the cultural relied on two types of data: “surface data” about many (sociology, economics, political science) and “deep data” about a few (psychology, psychoanalysis, anthropology, ethnography, art history; methods such as “thick description” and “close reading”). For example, a sociologist worked with census data that covered most of the country’s citizens; however, this data was collected only every 10 years and it represented each individual only on a “macro” level, leaving out her/his opinions, feelings, tastes, moods, and motivations. In contrast, a psychologist was engaged with a single patient for years, tracking and interpreting exactly the kind of data which the census did not capture.

In the middle between these two methodologies of “surface data” and “deep data” were statistics and the concept of sampling. By carefully choosing her sample, a researcher could expand certain types of data about the few into knowledge about the many. For example, starting in the 1950s, the Nielsen Company collected TV viewing data in a sample of American homes (via diaries and special devices connected to TV sets in 25,000 homes), and then used this sample data to tell TV networks their ratings for the whole country for a particular show (i.e. the percentage of the population which watched this show). But the use of samples to learn about larger populations had many limitations.

For instance, in the example of Nielsen’s TV ratings, the small sample used did not tell us anything about the actual hour by hour, day to day patterns of TV viewing of every individual or every family outside of this sample. Maybe certain people watched only news the whole day; others only tuned in to concerts; others had the TV on but never paid attention to it; still others happened to prefer the shows which got very low ratings from the sample group; and so on. The sample data could not tell us any of this. It was also possible that a particular TV program would get zero shares because nobody in the sample audience happened to watch it – and in fact, this happened more than once.
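The zero-share case can be made concrete with a little arithmetic. The sketch below is only an illustration, not Nielsen's method: it assumes households are sampled independently and at random, uses the 25,000-home figure from the text, and the "true share" values are hypothetical. Under those assumptions, a show watched by 0.01% of all homes would be missed entirely by the sample roughly 8% of the time.

```python
import math


def prob_zero_in_sample(true_share, sample_size):
    """Probability that no sampled household watches a show.

    Assumes each household is drawn independently and watches the show
    with probability `true_share` (a simplifying assumption).
    """
    return (1.0 - true_share) ** sample_size


sample_size = 25_000  # sample size mentioned in the text

# Hypothetical true audience shares, as fractions of all households.
for true_share in (0.001, 0.0001, 0.00001):
    p0 = prob_zero_in_sample(true_share, sample_size)
    print(f"true share {true_share:.3%}: P(zero viewers in sample) = {p0:.4f}")
```

The smaller the real audience, the more likely the sample reports nothing at all, even though thousands of people outside the sample may be watching.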

Think of what happens when you take a low-res image and make it many times bigger. For example, let’s say you start with a 10x10 pixel image (100 pixels in total) and resize it to 1000x1000 (one million pixels in total). You don’t get any new details – only larger pixels. This is exactly what happens when you use a small sample to predict the behavior of a much larger population. A “pixel” which originally represented one person comes to represent 10,000 people who are all assumed to behave in exactly the same way.
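The pixel analogy can be checked with a few lines of code. This is a minimal sketch using nearest-neighbour enlargement on a made-up 10x10 grid of numbers (the function name and the pixel values are my own, chosen only for illustration). After enlarging, the image has a million pixels but still only the original 100 distinct values.

```python
def upscale_nearest(image, factor):
    """Enlarge a 2D grid by repeating every pixel `factor` times in both directions."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


# A hypothetical 10x10 "sample": every pixel holds a distinct value 0..99.
small = [[r * 10 + c for c in range(10)] for r in range(10)]
big = upscale_nearest(small, 100)

distinct_small = {p for row in small for p in row}
distinct_big = {p for row in big for p in row}
copies_per_pixel = sum(row.count(small[0][0]) for row in big)

print(len(big), "x", len(big[0]), "pixels after enlarging")
print(len(distinct_big), "distinct values (no new detail)")
print(copies_per_pixel, "copies of each original pixel")
```

Each original value is simply repeated in a 100x100 block, which is the "larger pixels, no new details" effect described above.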

The rise of social media along with the progress in computational tools that can process massive amounts of data makes possible a fundamentally new approach for the study of human beings and society. We no longer have to choose between data size and data depth. We can study exact trajectories formed by billions of cultural expressions, experiences, texts, and links. The detailed knowledge and insights that before could only be reached about a few can now be reached about many – very, very many.

In 2007, Bruno Latour summarized these developments as follows: “The precise forces that mould our subjectivities and the precise characters that furnish our imaginations are all open to inquiries by the social sciences. It is as if the inner workings of private worlds have been pried open because their inputs and outputs have become thoroughly traceable.” (Bruno Latour, “Beware, your imagination leaves digital traces”, Times Higher Education Literary Supplement, April 6, 2007.)

Two years earlier, in 2005, Nathan Eagle at MIT Media Lab was already thinking along similar lines. He and Alex Pentland put up a web site, “reality mining”, and wrote about how the new possibilities of capturing details of people’s daily behavior and communication via mobile phones can create “Sociology in the 21st century.” To put this idea into practice, they distributed Nokia phones with special software to 100 MIT students who then used these phones for 9 months – which generated approximately 60 years of “continuous data on daily human behavior”.

Finally, think of Google search. Google’s algorithms analyze text on all web pages they can find, plus “PDF, Word documents, Excel spreadsheets, Flash SWF, plain text files, and so on,” and, since 2009, Facebook and Twitter content. Currently Google does not offer any product that would allow a user to analyze patterns in all this data the way Google Trends does with search queries and Google’s Ngram Viewer does with digitized books – but it is certainly technologically conceivable. Imagine being able to study the collective intellectual space of the whole planet, seeing how ideas emerge and diffuse, burst and die, how they get linked together, and so on – across the data set estimated to contain at least 14.55 billion pages (as of March 31, 2011).

Does all this sound exciting? It certainly does. What may be wrong with these arguments? Quite a few things.

[The second part of the article will be posted here within the next few days.]

I am grateful to UCSD faculty member James Fowler for an inspiring conversation a few years ago about the depth/surface questions. See also his pioneering social science research.

Wednesday, 30 March 2011

me and my sister valia rotoscoped

one of the images that are stuck in my head, village time.

me and mom rotoscoped

I will choose to omit certain aspects of the photographs while rotoscoping, in this case, the rest of the dress and my legs.

5 years old at gefinor

Mixing and matching memories to places and bits of objects (at a later stage), I rotoscoped myself against the gefinor location (which I shot a few days ago, and which still needs to be shot again). Every element will be saved as a different file, so that location and character or object never fall under the same set.

Monday, 28 March 2011

the future of the book

Again, rula, thanks for sharing!

Lev Manovich. Data Visualization. 2009. 3/3

Lev Manovich. Data Visualization. 2009. 2/3

Lev Manovich: Data Visualization. 2009. 1/3

Dot. The world's smallest stop-motion animation character shot on a Nokia N8

Not my style, but it's worth watching.

"Professor Fletcher's invention of the CellScope, which is a Nokia device with a microscope attachment, was the inspiration for a teeny-tiny film created by Sumo Science at Aardman. It stars a 9mm girl called Dot as she struggles through a microscopic world. All the minuscule detail was shot using CellScope technology and a Nokia N8, with its 12 megapixel camera and Carl Zeiss optics."

her morning elegance - another stop motion

A dear friend shared this video.

post-it stop motion

Some inspiring practices.

Saturday, 26 March 2011

Tarkovsky's notes

My friend gave me Tarkovsky's "Sculpting in Time" as a birthday gift. I was going through the introduction, meaning to get back to the book at a later stage, when I found myself reading more and more of it. I reached the part where he speaks about non-linearity, poetry in cinema and the portrayal of life... The note on perceiving someone on the street and how this moment is remembered later on made me think about blur in image making, contour-less objects or sceneries... I am too lazy to rewrite the whole section, so I will just attach photos of the pages here.

landscape as objects sculptures

On another note, in the same book as the post below, I found this sculpture by Rachel Whiteread, "House", 1993, building materials and plaster, height approx. 10 m, now destroyed. I saw Rachel's work earlier last year at Tate Britain, but this is my first encounter with this project. What I find interesting is how she managed to create an object from a house, which in a way explores an aspect I am looking into: composing a universe from already existing universes. The photo's quality is bad; I took the shot in a café, the light was dim and I only had my iPhone's camera on me.

objects as sculptures

I was skimming through the art book "Art Now", and in the back of my mind I was hoping to find object-related works. I found this sculpture by David Mach, "The Bike Stops Here", 1989, 134 x 134 x 81 cm, deer head and bicycle. The main trick is that the bicycle's handlebars become the deer's antlers, which I find a witty way of composing a universe, the latter being one of my concerns lately for the project.

Thursday, 24 March 2011

location: GEFINOR - possible pano - 1 angle - vertical

I have to look at how to homogenize the sky when shooting; otherwise, this could easily be fixed in Photoshop.

location: GEFINOR - extra shots - upper view

location: GEFINOR - a test on the canon stitcher software

I am not using software that lets the user modify much, so this is just a sketch of a possible panorama of the GEFINOR location. My ambition is to be able to create a new universe that remains loyal to the original locations, yet is composited to fit my requests. So another layer should be added here: the photograph of me and my sister sitting on a small pool in the village, which I think will be rotoscoped.

location: GEFINOR

This is one of the locations that I am sure about. It is where my mom used to work and it is one of the landmarks of Beirut. I also spent a lot of time there in my childhood, mainly either accompanying my mother to work on Saturdays (Banque Libano Française) or going to my general pediatrician in the block facing the bank. This location will definitely be included in one of the loops. While I am still reading the manual of my new camera (!!), these tests were shot using a Canon EOS T3i in the Basic Zone mode Scene Intelligent Auto.
Some of the selected test shots:

for the chart animations

I still have to set my priorities in terms of what goes in the project and what doesn't. Every time I think that I have gone through this process, it turns out that the final outcome cannot be that varied in terms of the scenes selected.

I was browsing for tutorials for the chart animation and found this tutorial by Mylenium that shows how to create a fully animatable diagram with a few expressions and some thinking around the corner. "While there are other techniques to create such graphs manually based upon Masks or the Write-On effect, this one has the advantage of being quite flexible and allowing last minute changes once you have the basic template. It cannot save you from buying dedicated software if you need things such as XML import or more variations of the look, however. In order to maintain maximum control over the look and avoid overloading one comp with too many layers, we are going to use several pre-compositions and cross-reference them with each other."

the final outcome should look like this, but by playing around and modifying the angles, I might be able to get rid of the slight angle displayed in the final outcome.

the most typical face on the planet - compositing images

I would have loved to watch the process of compositing the image, but it doesn't show in this short clip.

Briefly, this video is about the most typical face on the planet.

Here's what the article says:

"National Geographic Magazine released a video clip, below, showing the most "typical" human face on the planet as part of its series on the human race called "Population 7 billion."

The researchers conclude that a male, 28-year-old Han Chinese man is the most typical person on the planet. There are 9 million of them.

The image above is a composite of nearly 200,000 photos of men who fit that description.

Don't get used to the results, however. Within 20 years, the most typical person will reside in India."

You can check out the video below:

Tuesday, 15 March 2011

on the way home - 2 visuals 2 different nights

Walking home 2 days ago, I saw the snail on the pavement; 4 nights earlier, I had come across a fresh "you are not alone" graffiti! The pictures were taken with the iPhone 4 camera, hence the not-so-great lighting.

The debate about psychogeography

I was reading Geoff Nicholson's "The Lost Art of Walking" to get ideas about walking and filming for the project. I found this part very relevant: he is discussing the "psychogeography" of the Situationists, and for me this makes great sense.

Thursday, 10 March 2011

another good article: Graphing Culture by James Williford

“The problem with the humanities,” Lev Manovich told me over a quick meal at a strip-mall sushi joint in La Jolla last January, “is that people tend to worry too much about what can’t be done, about mistakes, problems, as opposed to just going and doing something.”

It was almost 8:00 p.m. Manovich, a professor of visual arts at UC–San Diego, had already spent the better part of the day in faculty meetings, led a class of undergrads through a nearly three-hour session of Time- and Process-Based Digital Media II, attended an informational event hosted by Google, and caught up on the progress of several of his graduate and postdoctoral researchers. In a few minutes, he’d be on his way home to put the finishing touches on a work of video art that needed to be installed at a friend’s gallery in downtown San Diego the next day. Inertia, I thought—of the intellectual variety or any other—is simply not a part of this guy’s constitution. “Of course, visualizations can’t represent everything,” he continued. “Of course, there are limits, but let’s not spend all of our time talking about it—let’s go ahead and do it, let’s figure out what we can do, right?”

Lev Manovich in front of the HIPerSpace.
—Photo courtesy of Calit2

Manovich was expounding the merits of “cultural analytics”—which he inaugurated in a 2009 essay—“a new paradigm for the study, teaching, and public presentation of cultural artifacts, dynamics, and flows.” Inspired by both the explosion of media on the Internet in recent years (YouTube, Flickr, ARTstor) and the increasingly interactive nature of our everyday media experiences (browsing the Web, playing computer games, manipulating images in Photoshop), the general idea of cultural analytics is to apply data visualization and analysis techniques traditionally associated with the so-called hard sciences—graphing, mapping, diagramming, and so on—to the study of visual culture. The difference between Manovich’s essay and so many other attempts to outline potential intersections between new media and the humanities is that it was more of a report to the academy than a mere call to action.

Operating under the banner of the Software Studies Initiative, Manovich and a handful of other scholars at UCSD’s Center for Research in Computing and the Arts had already spent two years asking and taking practical steps toward answering such questions as, What can one do with the vast archives of cultural material now available, as we say, at the click of a button? Where might one begin a discussion of the aesthetic properties of the millions of user-generated videos posted online? How does one capture and summarize the dynamics of sixty-plus hours of video-game play?

The entire enterprise, they discovered early on, was not so straightforward as feeding new kinds of datasets into existing software systems and interpreting the results they spit out. As Manovich explained, there are too many assumptions and predetermined pigeonholes built into most scientific visualization technologies. “So, for example, you have some type of medical imaging technique that you use for distinguishing healthy cells from cancerous ones.” A great thing, of course, if you happen to be an oncologist or an oncologist’s patient. “But you don’t want to divide culture into a few small categories,” he said. “What’s interesting about culture is that the categories are continuous. Instead of using these techniques to reduce complexity, to divide data into a few categories, I want to map the complexity.”

Jeremy Douglass (of UCSD’s Software Studies Initiative) and Florian Wiencek (of Jacobs University, Bremen) explore an image set of over one million pages of manga. The visualization, which draws on 883 different titles, raises questions about the relations between genre, visual style, and target audience.
—Photo Courtesy of the Software Studies Initiative

Among a variety of approaches to cultural analytics that the Software Studies team has developed since 2007 is a new (though, Manovich was careful to point out, not entirely unprecedented) kind of visualization that they call “direct” or “non-reductive.” Unlike most visualizations, which use points, lines, and other graphical primitives to represent data by abstraction, direct visualizations use images of the actual cultural objects from which the data was derived in the first place. In other words, instead of creating a standard point-based scatter plot reflecting, say, the brightness of Mark Rothko’s paintings over time, a direct visualization of that dataset will show the same pattern by distributing images of the paintings themselves across the graph space. Beyond its obvious visual appeal, the method offers a number of practical advantages to the humanities scholar. Seeing the pattern in its original context “should allow you to notice all kinds of other patterns that are not necessarily represented in your measurements,” said Manovich. It also allows you to move quickly between close reading—focusing on a single Rothko painting—and what literary historian Franco Moretti has termed “distant reading”—viewing a whole set of Rothkos at once. And with a large enough collection of data, Manovich added, “you might even discover other ‘zoom levels’ from which you can look at culture that may not correspond to a book, an author, a group of authors, a historical period, but are equally interesting. You can slice through cultural data diagonally in all kinds of ways.”

The day before, Manovich and Jeremy Douglass, a Software Studies postdoc with a background in literary theory and game studies, had shown me around some of the team’s recent data-slicing projects. They took me to the second floor of UCSD’s Atkinson Hall, where, at the edge of a large, cubicled workspace, the university keeps one of the highest resolution display systems in the world—the 238.5 square foot, 286.7 megapixel Highly Interactive Parallelized Display Space, or HIPerSpace, for short.

The first visualization they loaded was one of their simplest, but also most striking: a montage of every Time magazine cover published between March 3, 1923, (the first issue) and September 14, 2009—4,535 images in all. Laid out chronologically, beginning at the upper lefthand corner with a cover featuring Illinois Congressman Joseph G. Cannon and ending at the lower right with Jay Leno, the series immediately reveals certain patterns in magazine’s stylistic evolution: the shrinking of a decorative white border, the gradual transition from black-and-white to color printing, long periods in which certain hues come to dominate, and so on. From a distance, it’s a bit like looking at the annual growth rings of some felled ancient tree—except that instead of simply indicating a history of local climate conditions, the patterns here raise questions about the broader milieu in which the changes took place, about the nature of visual style, and perhaps even about the very idea of historical patterns. Visualization “makes the job of our visual system easier,” said Manovich, “but it’s not going to explain a pattern. It confronts you with something you wouldn’t notice otherwise, confronts you with new cultural facts. You see things that, probably, nobody has noticed before, new cultural patterns that you now have to explain.”

Another, slightly more sophisticated, approach to the same dataset—distributing the Time covers horizontally by date of publication and vertically according to relative color saturation levels (basically, vividness)—brought to light a number of “outliers,” data points that do not conform to the general pattern or range of the set as a whole. These, Douglass suggested, can help draw you into traditional close readings, but from angles that you might not have anticipated. “One cover out of forty-five hundred, a cover that might not have been significant to you if you were just searching indexes of Time or thinking about a particular topic, suddenly becomes significant in the context of a large historical or design system. It’s not that you would have known that it was important ahead of time—it’s not like, ‘Oh, this is the Finnegans Wake of Time covers’—it’s that, contextually, it’s significant. And when I dive into the visualization, I can actually see what these extremely saturated covers depict.”
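To get a feel for how such outliers might be picked out, here is a minimal sketch. It is not the Software Studies team's code: the pixel lists are hypothetical stand-ins for cover images, saturation is taken from the HSV colour model via Python's standard `colorsys` module, and "outlier" is assumed to mean "more than two standard deviations from the mean", which is only one of several possible definitions.

```python
import colorsys
import statistics


def mean_saturation(pixels):
    """Average HSV saturation (0..1) of a list of (r, g, b) tuples in 0..255."""
    sats = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in pixels]
    return statistics.fmean(sats)


def find_outliers(values, threshold=2.0):
    """Indices whose value is more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) > threshold * spread]


# Hypothetical stand-ins for covers: nine muted ones and one almost pure red.
covers = [[(120 + i, 110 + i, 100 + i)] * 4 for i in range(9)]
covers.append([(250, 5, 5)] * 4)

saturations = [mean_saturation(cover) for cover in covers]
print(find_outliers(saturations))  # index of the red cover: [9]
```

In a real plot, the list index would be replaced by the publication date on the horizontal axis, with the saturation value on the vertical axis.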

A visualization created by undergraduate students in UCSD’s visual art program.
—Photo courtesy of the Software Studies Initiative.

As it turned out—and this seemed to come as a surprise to Manovich and Douglass—quite a few of those saturated outliers were covers that dealt in one way or another with communism. “Pure red, pure binary enemy, right?” laughed Manovich, who grew up in the Soviet Union during what he described as “the last stages of a decaying so-called Communist society.”

“You can think of each of these visualization techniques as a different photograph of your data,” said Manovich, “as photographs of your data taken from different points of view. If I were taking a photograph of Eduardo”—Eduardo Navas, another Software Studies researcher, who had joined us for the HIPerSpace demonstration—“I could take it from the front or from the side. I will notice in both photographs some of the same patterns, some of the same proportions, but each photograph will also give me access to particular information, each point of view will give me additional insights.”

When Douglass called up a visualization from another project, one in which the team analyzed and, in various ways, graphed over one million pages of manga (Japanese comics), Manovich remarked, “This is kind of our pièce de résistance.” It was easy to see why. Technically, the x-axis reflects standard deviation, and the y-axis, entropy—a configuration that results in the most detailed of the scanned pages, the ones with the most pictorial elements and intricate textures, appearing along the upper right curve of the graph, while the simplest, those dominated by either black or white space, trail off toward the lower left. But what most impressed me was the strangely immersive aesthetic experience it produced. What at first looked to me like a bell-shaped white blob set against a grey background, resolved, as Douglass zoomed in and around the visualization, into a complex field of truly startling density—pages from different comics, representing different genres, drawn by different artists, so crowded together that they overlap, seeming almost to compete with one another over the available space. This, it occurred to me, is about as close as I’ll ever get to stepping into one of Andreas Gursky’s photographs.
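The two axes mentioned here can be sketched in a few lines. This is only my guess at what is being measured: I assume both numbers are computed over the grey values of a scanned page, with entropy meaning the Shannon entropy of the grey-value histogram. The two "pages" are tiny made-up lists, not real scans.

```python
import math
import statistics
from collections import Counter


def grey_stddev(pixels):
    """Standard deviation of grey values (0..255): a rough measure of contrast."""
    return statistics.pstdev(pixels)


def grey_entropy(pixels):
    """Shannon entropy, in bits, of the histogram of grey values."""
    counts = Counter(pixels)
    total = len(pixels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical pages: one entirely white, one mixing four grey levels evenly.
blank_page = [255] * 64
busy_page = [0, 85, 170, 255] * 16

for name, page in (("blank", blank_page), ("busy", busy_page)):
    print(name, "std dev:", grey_stddev(page), "entropy:", grey_entropy(page))
```

A page dominated by white or black scores low on both measures, while a page with many different tones scores high on both, which matches the lower-left to upper-right spread described in the article.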

The team is aware of the artistic side of their work. In fact, their visualizations have been shown at a number of art and design venues, including the Graphic Design Museum in the Netherlands, the Gwangju Biennale in South Korea, and, last fall, at one of UCSD’s own exhibition spaces, the gallery@calit2. “For many people who enjoyed the show” at UCSD, Douglass said, “it was just about the visualization as a kind of spectacle and object of desire. They may not care about manga at all, may have no desire to read manga; but the idea that manga has a shape, or the idea that it’s all in one place—it’s a dream of flying over a landscape. It’s not about wanting to live there, it’s just the fact that you’re flying that’s so compelling.

“But,” he added, “that’s not my relationship to what I’m doing. I don’t spend half my time trying to make technical illustrations and half my time trying to create beautiful sculptures. I just move back and forth seamlessly through the space, and often don’t worry about which one I’m doing. I’ve noticed that some people will say, ‘That’s not an information visualization, that’s an artwork,’ and some people will say, ‘This is totally schematic, you did this procedurally and it’s just informative.’ But people who don’t have that kind of disciplinary anxiety just say, ‘That’s beautiful and interesting.’”

“Part of what I am trying to do,” Manovich said, “is to find visual forms in datasets which do not simply reveal patterns, but reveal patterns with some kind of connotational or additional meanings that correspond to the data. But partly, with something like the Time covers, I’m also trying to create an artistic image of history.”

James Williford is an editorial assistant at HUMANITIES and a graduate student at Georgetown University.

The Software Studies Initiative has received two grants from NEH. The first, a Humanities High-Performance Computing grant of $7,969, was used to analyze the visual properties of various cultural datasets at the National Energy Research Scientific Computing Center. The Software Studies team is currently using a Digital Humanities Start-Up grant of $50,000 to develop a version of their visualization software that will run on PC and Mac computers.