Chapter 5: Making Your Data Portal Useful
We provide a number of suggestions for maximizing the impact of the data portal on your campus community.
5.1 Dataset Acquisition
Not only does your team need to build out the data portal, but it also needs to acquire datasets to populate the portal. It is important to think about where to find these datasets, how to create pipelines to upload datasets, and lastly how to target high-value data. There are two pipelines for acquiring datasets: the university administration and community actors.
University administration. The group should understand the data ecosystem of the university and research answers to the following questions:
- Are datasets controlled by a central group or are they spread across various departments?
- Who has the final say when releasing datasets?
- Are there any data governance groups in the administration?
- Which datasets are legally authorized for public access according to FERPA and other privacy laws?
Administrative data should be more open to the public with the necessary caveats, as it is often the source by which university decisions are made; this makes it important for the public to be able to find biases and errors in the dataset. We recommend reaching out to your university’s institutional research and/or data governance group to start the search. Be sure to refer to our ‘Fostering a Relationship with University Administration’ section for more information.
Community actors. Students, student groups, student publications, student governments, and various other community members also have datasets that are separate from those held by the university administration. As such, it is critical to provide a pipeline for these actors to upload datasets to your portal. Be sure to refer to our ‘Data Governance’ section for information about how to create a process for uploading these datasets.
An open data portal group can serve as a liaison between the community and the administration. In the initial stages, it would be helpful to include a ‘Dataset request’ form to allow members of the community to request datasets that may not be on the portal. Later, the group could play a leading role in representing the student body for data requests. For example, the Stanford Open Data Project is working to create a community-wide Open Data Council, which would include representatives from key campus organizations like the student government, campus newspaper, and advocacy groups. Then, if a member proposed requesting a dataset, that request could be easily passed along to the organization that each member represents. This would allow the council to rapidly collect dataset requests and gather support for them, which could then be demonstrated to the administration. The Stanford team believes this would facilitate greater participation from the student body in identifying and requesting datasets, and would improve the student body’s standing when making requests of the administration.
In addition, hosting high-value data is a key step in maximizing your project’s value. High-value datasets might include data on diversity, finances, academic outcomes, crime (often obtained through FOIA requests), surveys of the student body (although data quality is often an issue), and course ratings (often not downloadable in machine-readable formats).
With this in mind, where should you start looking for these datasets? We have found the following steps to be helpful:
- Check the websites! A large number of datasets are available on various university-affiliated websites. The challenge is often locating these websites, rather than extracting the datasets, and members of the campus community may be able to point you toward the most helpful websites.
- Look for data related to high-visibility issues. For example, many campuses have vocal campaigns for fossil fuel divestment. If you know you’re looking for the university’s endowment information, you can ask specific questions to the right people.
- Figure out what campus activists need. Often, activism can be strengthened with data-driven insights. Additionally, activists on the ground might know of specific datasets or sources that could be valuable. Work with them to identify, process, and analyze these datasets.
5.1.1 Importance of a Data Map
The data map is a useful organizational tool (e.g. a spreadsheet) that serves as a comprehensive listing of what data is available and what is not.
A data map should answer the following questions:
- Which datasets exist?
- Where can a user find these datasets?
- Which datasets do users have access to? Which datasets do users not have access to?
- Have requests already been made to access a particular dataset? What was their outcome?
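The questions above map naturally onto the columns of a spreadsheet. As a minimal sketch, assuming a team wants to keep its data map as a CSV file, the Python below models one row per dataset. The field names, the helper functions, and the two sample rows are hypothetical placeholders for illustration only, not a prescribed schema or real datasets.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class DataMapEntry:
    """One row of a data map (all field names are illustrative assumptions)."""
    name: str                              # Which dataset exists?
    location: str                          # Where can a user find it?
    accessible: bool                       # Do users currently have access?
    request_outcome: Optional[str] = None  # None if no request has been made


def inaccessible(entries: List[DataMapEntry]) -> List[str]:
    """Return the names of datasets users cannot access yet."""
    return [entry.name for entry in entries if not entry.accessible]


def to_csv(entries: List[DataMapEntry]) -> str:
    """Serialize the data map to CSV text that opens in any spreadsheet tool."""
    buffer = io.StringIO()
    columns = [field.name for field in fields(DataMapEntry)]
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Placeholder rows; replace with the datasets your team actually identifies.
data_map = [
    DataMapEntry("Example dataset A", "https://example.org/dataset-a", True),
    DataMapEntry("Example dataset B", "Example office (not online)", False,
                 "requested; pending"),
]

if __name__ == "__main__":
    print(to_csv(data_map))
    print("Not yet accessible:", inaccessible(data_map))
```

Keeping the map in a plain CSV file means non-programmers can still edit it in a spreadsheet application, while a helper such as `inaccessible` gives the team a quick list of datasets to prioritize in future requests.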
5.2 Empower Data Analysis
In addition to centralizing datasets, an open data portal should make it easier to analyze these datasets. By implementing the following strategies, your team can not only increase data literacy across your university but also empower analysis of the datasets on the portal:
- Secure datasets that are in a machine-readable format (e.g. .csv, .xlsx)
- Provide links to tutorials for data analysis
- Include links to data recipes (e.g. articles that have used the data and the processes used to analyze it)
- Depending on the scope of your team, hold workshops on data analysis that use the data portal
- Feature and promote data analysis outcomes
5.3 Generate Buy-In and Demand
Building an open data portal requires participation from university community members with regard to acquiring datasets, identifying users, and building a public-facing product.
Here are some action items to consider as you begin to reach out to your campus community:
- Reach out to student government, student-run newspapers, and other groups as potential users of, promoters of, and/or contributors to the open data portal. For example, the student newspaper could both use the data portal for writing articles and market it through its media. Student government and newspapers are particularly important because of how frequently they require data in their public-facing communication and articles, respectively. These groups may also be able to contribute to the portal, as they frequently collect and use their own datasets.
- Reach out to student activist and advocacy groups. Open data organizations can supercharge activist efforts by organizing key constituencies on campus to convince the administration to collect/release data on marginalized communities, e.g. disaggregating Asian American ethnicities or ethnic/gender makeup by major. There are many groups already working to hold the university accountable whose efforts would be bolstered by an aligned Open Data group.
- Promote the portal through computer science, engineering, and other relevant departments’ mailing lists. Students and members of these departments will find an open data portal useful for projects, classes, and more.
- Post on class Facebook groups, LinkedIn, and other relevant social media.
- Depending on the scope of your open data project, hold events around the data portal, such as data-thons, speakers, and workshops, to teach end users how to use open data effectively. This can generate interest in both contributing to and using your portal, as well as surface new ideas for your own team to work on.
- Connect with professors that teach classes that involve data analysis in any capacity. For example, at Northwestern, there is an Analytics for Social Good class and at Stanford there is the Data Challenge Lab. Working on data about your community can generate interest in learning about data analytics or statistics.
- Consider forming a formal inter-organizational data request council in order to make data requests to the university administration that will carry more weight. Universities are more likely to respond to requests when they have the weight of a broad coalition of students.
5.4 Fostering a Relationship with University Administration
A relationship with the administration expands the impact of your data portal. There are a variety of reasons that you should work closely with them, including:
- Legitimacy: Working with the university can make the project more credible, helping you get more data and support from other organizations. For example, having a website address ending in .edu signals legitimacy to an organization like the campus police department.
- Risk Mitigation: The university will likely raise good questions around data security and governance. Thinking through these questions, with input from university stakeholders, will help you mitigate risks.
- Impact: The university has significant resources and data, which you can tap into. Additionally, universities also have well-established outreach channels, which can help get your platform in the hands of more users across the campus.
- Sustainability: We’ve found that one of the biggest problems with student-led public interest technology is that when team members leave (e.g. upon graduation), the projects fall into disrepair. You can improve the long-term viability of the project by working with the university. For example, if you establish procedures so that the university automatically updates datasets on your portal, it’s more likely that the project will continue being useful.
These opportunities, however, will increase the administration’s workload. You should do as much of the heavy lifting as you can, such as reading this document, applying and relaying information from your data governance standards, and working through specific issues one at a time, in accordance with the administration’s timelines.
As you navigate administrative relations, you should select a point person on both ends, administration and the team, to facilitate engagement; prepare a pitch that demonstrates the value of an open data portal to the university; and if applicable, include your faculty advisor in your communication with the administration.
KEY IDEA
Speaking as people who have been through this before, we also recommend adjusting your expectations to account for administrative hurdles. It is important to understand that although we treat the administration as one entity in this document, it is made up of departments with competing interests and differing opinions.
If the administration needs more concrete examples, there are resources citing the benefits of having an open data portal. Here are a few to get you started:
- Improve your community: Data collected to benefit a community should be cited, transparent, and available to empower more of those data-driven decisions. Furthermore, access to an open data portal can inspire new civic-minded efforts.
- Increase collaboration: By having an open data portal, you can avoid duplicative work, increase data sharing internally and stimulate interdepartmental analysis.
- Alignment with administrative goals: Open data reduces data requests to administration and improves the quality of the student-administration relationship.
5.5 Long-Term Sustainability
Creating a successful open data project entails ensuring sustainability beyond just building a portal. Here are some strategies for achieving long-term sustainability:
- Institutionalization: Creating a formal open data organization helps ensure continuity beyond your tenure. Having an executive board with designated roles and titles will also increase buy-in from leadership within specific areas and will also encourage effort from new members who see a pathway to organization leadership. The club could host data-thons and speakers that encourage use of the data portal.
- Recruitment: It’s important to have underclassmen who will be the future leaders of your organization. Have them join meetings with stakeholders to ease the transition.
- Documentation: Recording your experiences/work thus far will allow you to preserve institutional knowledge beyond your tenure.
- Administration: Creating relationships with the university administration will provide both added legitimacy and accountability for delivering continuous, high-quality work.