Abstract
As cities throughout the developing world grow, they often expand more quickly than the infrastructure and service delivery networks that provide residents with basic necessities such as water and public safety. Why do some cities deliver more effective infrastructure and services in the face of rapid growth than others? Why do some households and communities secure better services than others? Answering these questions requires studying the large, politicized bureaucracies charged with providing urban services, especially the relationships between frontline workers, agency managers, and citizens in informal settlements. Researchers investigating public service delivery in cities of the Global South, however, have faced acute data scarcity when addressing these themes. The recent emergence of crowd-sourced data offers researchers new means of addressing such questions. In this paper, we draw on our own research on the politics of urban water delivery in India to highlight new types of analysis that are possible using crowd-sourced data and propose solutions to common pitfalls associated with analyzing it. These insights should be of use for researchers working on a broad range of topics in comparative politics where crowd-sourced data could provide leverage, such as protest politics, conflict processes, public opinion, and law and order.
Notes
As Scott (1996, pp. 74–75) famously noted, such information asymmetries can arise from experiential knowledge (“mētis”) as well as technical skills.
See World Bank (2016, Chapter 3) for a more extensive catalog of such initiatives.
While water meters can tabulate flow through specific pipes and connections, these are typically read manually at regular intervals and thus do not give utilities real-time information on how often and when water is delivered to particular areas.
NextDrop’s revenue model involved charging utilities for information services, including real-time information on water flows.
NextDrop was started by a group of U.C. Berkeley engineering graduates, among others.
The emphasis on assessing the accuracy of a principal source of remotely collected data thus differs subtly from triangulation, usually defined as inference based on multiple sources of evidence, such that “diverse viewpoints cast light upon a topic” (Olsen 2004). Groundtruthing, in contrast, focuses on data validation rather than inference.
Only a small percentage of social media data is geo-referenced, because this requires obtaining user consent or extracting location information from posted messages using automated text analysis. For example, a 2010 estimate put the share of geo-tagged Tweets at 0.23% (Bryant 2010; see also DuVander 2010). On the general point of selection bias in crowd-sourced data, see Mayer-Schönberger and Cukier (2013) and Offenhuber (2017, p. 169).
Van der Windt and Humphreys (2014), for example, compare conflict data sourced electronically from observers with survey data.
See Hyun et al. (2018) for more detail.
Lawrence (2017), for example, constructed a systematic sample of “first mover” protesters and potential protesters in Morocco. This in turn allowed her to recruit participants for a Facebook survey experiment from a network of activists. Van der Windt and Humphreys (2014) provided a set of individuals in randomly selected villages in the Democratic Republic of Congo with mobile phones and training in reporting conflict events.
Our research focused on nine valvemen in one of the utility’s 32 subdivisions. They were shadowed for approximately 4 months in total.
Observation of this sort can, of course, suffer from the Hawthorne effect. In our case, the danger would be that valvemen would be more likely to report as expected when observed. However, this made observations of divergence from expectations in our presence particularly informative.
Hyun et al. (2018) provide this analysis.
A fuller discussion of incentivizing data contributions appears below.
Details in this paragraph are drawn from Kumar et al. (2018).
Van der Windt and Humphreys (2014) also utilize qualitative groundtruthing, in their case to assess the accuracy of reports of conflict. The authors sent field coordinators to verify the quality of their “crowdseeded” conflict data from the Democratic Republic of Congo: coordinators observed whether or not contributors understood coding schemes and assessed the accuracy of reporting. The paper, unfortunately, does not provide detail on the types of qualitative research methods used to assess data accuracy.
Centers facilitating such collaborations include the Social Media and Political Participation Laboratory at New York University (http://smapp.nyu.edu/about.html), the Center for Information Technology Research in the Interest of Society at University of California at Berkeley (http://citris-uc.org/about-citris/), and the Media Cloud at Harvard and MIT (http://mediacloud.org/).
References
Anand N. Municipal disconnect: on abject water and its urban infrastructures. Ethnography. 2012;13(4):487–509.
Auerbach A. Clients and communities: the political economy of party network organization and development in India’s urban slums. World Politics. 2016;68(1):111–48.
Bailard CS. A field experiment on the internet’s effect in an African election: savvier citizens, disaffected voters, or both? J Commun. 2012;62(2):330–44.
Baird S, Bohren JA, McIntosh C, Ozler B. Designing experiments to measure spillover effects, second version. 2015. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2619724.
Barberá P. The 2013 Italian parliamentary elections on Twitter. 2013.
Barberá P. Birds of the same feather tweet together: Bayesian ideal point estimation using Twitter data. Polit Anal. 2015;23(1):76–91.
Barberá P, Metzger M. A breakout role for Twitter? The role of social media in the Turkish Protests (social media and political participation lab data report). Social Media and Political Participation Lab; 2013.
Björkman L. Pipe politics, contested waters: embedded infrastructures of millennial Mumbai. Durham: Duke University Press; 2015.
Bond R, Messing S. Quantifying social media’s political space: estimating ideology from publicly revealed preferences on Facebook. Am Polit Sci Rev. 2015;109(1):62–78.
Boutet A, Kim H, Yoneki E. What’s in your tweets? I know who you supported in the UK 2010 general election. ICWSM; 2012.
Breuer A, Landman T, Farquhar D. Social media and protest mobilization: evidence from the Tunisian revolution (SSRN Scholarly Paper No. ID 2133897). Rochester, NY: Social Science Research Network. 2012. Retrieved from http://papers.ssrn.com/abstract=2133897.
Bryant M. Twitter geo-fail? Only 0.23% of tweets geotagged. 2010. Retrieved April 20, 2015, from http://thenextweb.com/2010/01/15/twitter-geofail-023-tweets-geotagged/.
Bussell J. Serving clients and constituents: experimental evidence on political responsiveness. 2017.
Calvo E. Anatomia Politica de Twitter en Argentina. Buenos Aires: Capital Intellectual; 2015.
Carlson M, Jakli L, Linos K. Rumors and refugees: how government-created information vacuums undermine effective crisis management. Int Stud Q. Forthcoming.
Ching A, Zegras C, Kennedy S, Mamun M. A user-flocksourced bus experiment in Dhaka: new data collection technique with smartphones. Transportation Research Record: Journal of the Transportation Research Board. 2013. Retrieved from http://web.mit.edu/czegras/www/Flocksource_JUT.pdf.
DuVander A. 3 reasons geocoded tweets haven’t caught on and 2 reasons not to worry. 2010. Retrieved April 20, 2015, from http://www.programmableweb.com/news/3-reasons-geocoded-tweets-havent-caught-and-2-reasons-not-to-worry/2010/02/17.
Estellés-Arolas E, González-Ladrón-de-Guevara F. Towards an integrated crowdsourcing definition. J Inf Sci. 2012;38(2):189–200.
Furtado V, Caminha C, Ayres L, Santos H. Open government and citizen participation in law enforcement via crowd mapping. IEEE Intell Syst. 2012;27(4):63–9.
Gerber AS, Green DP. Field experiments: design, analysis, and interpretation. New York: W. W. Norton; 2012.
Grossman G, Humphreys M, Sacramone-Lutz G. “I wld like u WMP to extend electricity 2 our village”: on information technology and interest articulation. Am Polit Sci Rev. 2014;108(3):688–705.
Hargrave ML. Ground truthing the results of geophysical surveys. In: Johnson JK, Giardano M, Kvamme KL, Clay RB, Green TJ, Dalan RA, editors. Remote sensing in archaeology. Tuscaloosa: University of Alabama Press; 2006. p. 269–303.
Hyun C, Post AE, Ray I. Frontline worker compliance with transparency reforms: barriers posed by family and financial responsibilities. Governance. 2018;31:65–83.
Iliffe M, Sollazzo G, Morley J, Houghton R. Taarifa: improving public service provision in the developing world through a crowd-sourced location based reporting application. OSGeo J. 2014;13(1):34–40.
Jamal A, Keohane R, Romney D, Tingley D. Anti-Americanism or anti-interventionism in Arabic Twitter discourses. Perspect Polit. 2015;13(1):55–73.
Jha S, Rao V, Woolcock M. Governance in the gullies. World Dev. 2007;35(2):230–46.
Klopp J, Mutua J, Orwa D, Waiganjo P, White A, Williams S. Towards a standard for paratransit data: lessons from developing GTFS data for Nairobi’s Matatu System. Presented at the Transportation Research Board 93rd Annual Meeting. 2014. Retrieved from http://trid.trb.org/view.aspx?id=1289853.
Kruks-Wisner G. Seeking the local state: gender, caste, and the pursuit of public services in post-tsunami India. World Dev. 2011;39(7):1143–54.
Kruks-Wisner G. The pursuit of social welfare: citizen claim-making in rural India. World Politics. 2018;70(1):122–63. https://doi.org/10.1017/S0043887117000193.
Kumar T, Post AE, Ray I. Flows, leaks, and blockages in informational interventions: a field experimental study of Bangalore’s water sector. World Development. 2018;106:149–60.
Lawrence AK. Repression and activism among the Arab Spring’s first movers: evidence from Morocco’s February 20th movement. Br J Polit Sci. 2017;47(3):699–718.
Lipsky M. Street-level bureaucracy: dilemmas of the individual in public service. New York: Russell Sage Foundation; 1980.
Mayer-Schönberger V, Cukier K. Big data: a revolution that will transform how we live, work, and think. New York: Houghton Mifflin Harcourt; 2013.
Offenhuber D. Waste is information: infrastructure legibility and governance. Cambridge: The MIT Press; 2017.
Olsen W. Triangulation in social research: qualitative and quantitative methods can really be mixed. In: Holborn M, editor. Developments in sociology. Causeway Press; 2004.
Parikh T. Digital data collection for improving service delivery: a framework for decision-makers. 2015.
Peixoto T, Fox J. When does ICT-enabled citizen voice lead to government responsiveness. 2016 World Development Report Background Paper, (Internet for Development). 2015.
Post AE, Bronsoler V, Salman L. Hybrid regimes for local public goods provision: a framework for analysis. Perspect Polit. 2017;15(4):952–66.
Poushter J. Smartphone ownership and internet usage continues to climb in emerging economies. 2016. Retrieved June 12, 2017, from http://www.pewglobal.org/2016/02/22/smartphone-ownership-and-internet-usage-continues-to-climb-in-emerging-economies/.
Raza D. ‘I saw it on WhatsApp’: why people believe hoaxes on the messaging app. 2017. Retrieved June 12, 2017, from http://www.hindustantimes.com/i-saw-it-on-whatsapp-why-people-believe-hoaxes-on-the-messaging-app/story-TTAJjgLC7eL2Mb0LNxVGuJ.html.
Sachdev C. If you see dirty water, don’t just gripe. Talk to the cloud! [National Public Radio]. 2017. Retrieved June 9, 2017, from http://www.npr.org/sections/goatsandsoda/2017/06/07/527898124/if-you-see-dirty-water-dont-just-gripe-talk-to-the-cloud.
Scott JC. State simplifications: nature, space, and people. Nomos. 1996;38:42–85.
Starbird K, Palen L. (How) will the revolution be retweeted?: information diffusion and the 2011 Egyptian uprising. In: Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. New York, NY, USA: ACM; 2012. p. 7–16.
Story M, Congalton RG. Accuracy assessment: a user’s perspective. Photogramm Eng Remote Sens. 1986;52(3):397–9.
Telecom Regulatory Authority of India. The Indian telecom services performance indicator report. New Delhi, India. 2017. Retrieved from http://www.trai.gov.in/.
Touchton M, Wampler B. Improving social well-being through new democratic institutions. Comp Pol Stud. 2014;47(10):1442–69. https://doi.org/10.1177/0010414013512601.
UNICEF. U-report application revolutionizes social mobilization, empowering Ugandan youth. 2012a. Retrieved April 16, 2015, from http://www.unicef.org/infobycountry/uganda_62001.html.
UNICEF. TIME magazine covers UNICEF supported mTrac system in Uganda. 2012b. Retrieved April 16, 2015, from http://unicefstories.org/2012/08/16/time-magazine-covers-unicef-supported-mtrac-system-in-uganda/.
Vaccari C, Valeriani A, Barberá P, Bonneau R, Jost JT, Nagler J, et al. Political expression and action on social media: exploring the relationship between lower- and higher-threshold political activities among Twitter users in Italy. J Comput-Mediat Commun. 2015;20(2):221–39.
van den Berg C, Danilenko A. The IBNET water supply and sanitation performance blue book. Washington D.C.: The World Bank; 2011.
van der Windt P, Humphreys M. Crowdseeding conflict data. 2014.
World Bank. Digital dividends: world development report 2016. Washington D.C.: The World Bank Group; 2016.
Yadav A. Tension grips Rithora over objectionable WhatsApp post. 2015. Retrieved October 16, 2015, from http://timesofindia.indiatimes.com/city/bareilly/Tension-grips-Rithora-over-objectionable-WhatsApp-post/articleshow/47389539.cms.
Acknowledgments
This research was funded by a “DIL Innovate” Grant from the Development Impact Laboratory, Blum Center for Developing Economies (USAID Cooperative Agreement AID-OAA-A-13-00002, Alison Post and Isha Ray Principal Investigators), U.C. Berkeley, and a dissertation fieldwork grant from the Institute for International Studies, U.C. Berkeley. Tanu Kumar and Isha Ray, U.C. Berkeley, are co-authors of the impact evaluation project described in this paper. We thank Maria Chang for research assistance. We also thank NextDrop, the Public Affairs Foundation, and the Bangalore Water Supply and Sewerage Board (BWSSB) for their support of our research. We are grateful for comments from Thad Dunning, Agustina Giraudy, Tanu Kumar, Katerina Linos, Aila Matanock, Isha Ray, and seminar participants at U.C. Berkeley and American University.
Cite this article
Post, A.E., Agnihotri, A. & Hyun, C. Using Crowd-Sourced Data to Study Public Services: Lessons from Urban India. St Comp Int Dev 53, 324–342 (2018). https://doi.org/10.1007/s12116-018-9271-4
DOI: https://doi.org/10.1007/s12116-018-9271-4