what is this?

with all the recent talk about dotcom booms and busts, and all the really general statements people are making, i was curious about what's really going on now, what went on four years ago, and what effect the boom really had on the san francisco bay area.

this is an interactive graph of activity that has occurred over the past four years on craigslist for the san francisco bay area. craigslist is an extremely popular online community site based in san francisco. it can be thought of as an online classifieds board, widely accepted in the bay area as the de facto place to find out about housing, jobs, community events and pretty much anything else you might find in the classifieds section of a regional newspaper. this project graphs activity in the various categories on craigslist between march of 1998 and august of 2001.

this graph was built by harvesting the data from the egroups archive of the craigslist email list. the data was then cleaned up and tabulated by a few perl scripts. the graphs are generated by gnuplot, which is run by the webapp, which is also written in perl.
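as a rough illustration of the tabulation step, here is a minimal sketch in python (the real scripts are perl and are not shown here). the function name `tabulate_monthly`, the record layout of (date, category) pairs and the sample data are all assumptions made up for this example.

```python
from collections import Counter
from datetime import date


def tabulate_monthly(posts):
    """count posts per (year-month, category).

    posts is an iterable of (date, category) pairs; this layout is an
    assumption for the sketch, not the format of the real archive.
    """
    counts = Counter()
    for posted_on, category in posts:
        counts[(posted_on.strftime("%Y-%m"), category)] += 1
    return counts


# made-up sample records, for illustration only
sample = [
    (date(1998, 3, 2), "apartments for rent"),
    (date(1998, 3, 15), "apartments for rent"),
    (date(1998, 4, 1), "housing wanted"),
]
monthly = tabulate_monthly(sample)
```

a table like this, one count per month and category, is the kind of input a plotting tool such as gnuplot can draw as one line per category.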

it's fairly interesting, because you can see some reasonably hard data on what the san francisco bay area has been doing over the past four years. you can graph job postings from various industries against things like apartments for rent or housing wanted postings, as well as for sale postings and resumes.

one of the interesting things i found is that the number of housing wanted postings seems to be slightly down recently but pretty much unaffected by the drop in jobs. perhaps people are always in a state of wanting to move to a city.

one thing to bear in mind: this data isn't one hundred percent accurate, for a few reasons. people do repost their information multiple times in a given month, sometimes people list a number of available apartments in a single posting, and craigslist has become significantly more popular over the years in question. however, i do still think that it makes for a decent general indicator of trends in our community. (i'm considering doing some kind of normalizing based on the total volume of posts. i need to think about it some more.)
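one possible way to normalize by total volume, sketched in python under stated assumptions: divide each category's monthly count by the total number of posts in that month, so the graph shows a share instead of a raw count. this is only one option and not something the site does today. the function name `normalize_by_volume` and the sample numbers are hypothetical.

```python
def normalize_by_volume(counts):
    """turn raw monthly counts into each category's share of that month's posts.

    counts maps month -> {category: number of posts}; this layout is an
    assumption for the sketch.
    """
    shares = {}
    for month, by_category in counts.items():
        total = sum(by_category.values())
        shares[month] = {
            category: (n / total if total else 0.0)
            for category, n in by_category.items()
        }
    return shares


# made-up numbers, for illustration only
raw = {
    "1998-03": {"jobs": 30, "housing wanted": 10},
    "2001-08": {"jobs": 60, "housing wanted": 60},
}
shares = normalize_by_volume(raw)
```

in this made-up sample the raw job count doubles, yet the job share of all posts falls from 0.75 to 0.5, which is the kind of effect that growth in overall popularity can hide.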

also bear in mind that some categories on craigslist were created or removed during the time period represented. these categories are graphed as they appear, so if you see a line for a particular category start or end in the middle of the timeline, this is why. also, if you select one of these limited categories and graph it on its own, the time period on the graph will be limited to the lifespan of that category. for best results, select multiple categories.

if there's enough interest, i may expand this site to track craigslist in realtime and possibly provide some additional views of the data (moving averages and the like). if you have any ideas for different views of the data, or any other suggestions, please drop me a line.
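for anyone curious what a moving average view might look like, here is a minimal python sketch of a trailing moving average over monthly counts. the function name `moving_average`, the three-month window and the sample numbers are assumptions for illustration, not a planned feature.

```python
def moving_average(values, window=3):
    """trailing moving average over a list of monthly counts.

    the first window-1 points average over however many months are
    available so far, so the output has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# made-up monthly counts, for illustration only
smoothed = moving_average([10, 20, 30, 40], window=3)
# smoothed is [10.0, 15.0, 20.0, 30.0]
```

a longer window gives a smoother line but reacts more slowly to sudden changes, such as a sharp drop in job postings.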

either way, the data is there, and you can look at it for different time periods and categories. draw your own conclusions and have fun!

