'Big Data' Research Effort Faces Student-Privacy Questions

By Benjamin Herold, October 21, 2014
Ken Koedinger, a professor of human-computer interaction and psychology at Carnegie Mellon University, leads researchers building a database to analyze learning and behavioral information that students generate when they use digital-learning tools.

A coalition of prominent research universities is receiving federal support to redesign and scale up a massive repository for storing, sharing, and analyzing learning and behavioral data that students generate when using digital instructional tools, demonstrating the continued faith that many personalized-learning proponents have in the power of "big data" to transform schooling.

But the project, which is dubbed "LearnSphere" and in some respects echoes the ill-fated attempt by controversial nonprofit inBloom to facilitate the collection and sharing of large amounts of educational information, also raises new questions in the highly charged debate over student-data privacy.

The initiative, which was awarded a $4.8 million grant from the National Science Foundation, will be led by researchers at Carnegie Mellon University, in Pittsburgh, who propose to construct a new data-sharing infrastructure that is distributed across multiple institutions, including third-party and for-profit vendors. When complete, LearnSphere is likely to hold a massive amount of anonymous information, including:

  • "Clickstream" and other digital-interaction data generated by students using digital software provided to schools by the universities and vendors participating in LearnSphere;

  • Chat-window dialogue sent by students participating in some online courses and tutoring programs;

  • Potentially, "affect" and biometric data, including information generated from classroom observations, computerized analysis of students' posture, and sensors placed on students' skin, in order to track measures such as student engagement.

New Insights

Proponents say that facilitating the sharing and analysis of such information for research purposes can lead to new insights about how humans learn, as well as rapid improvements to the digital learning software now flooding schools.

"We are going to be able to understand the learning and instructional processes so much better than we already do," said Kenneth R. Koedinger, a professor of human-computer interaction and psychology at Carnegie Mellon and the principal investigator on the NSF grant. The other universities participating in the project are the Massachusetts Institute of Technology, Stanford University, and the University of Memphis.

But prominent student-data-privacy advocates warned that rapid expansion in the collection of students' digital data, even when done primarily for research purposes, is fraught with potential problems related to notification, consent, and data ownership.

At a minimum, LearnSphere "warrants further evaluation" before moving forward, said Khaliah Barnes, a lawyer with the Electronic Privacy Information Center, a Washington-based advocacy group.

"It's not at all clear that parents and students are fine with having their information data-mined in this way," Ms. Barnes said.

Information-Sharing Goals

LearnSphere's grant was one of 14, totaling $31 million, that the NSF announced earlier this month as part of its Data Infrastructure Building Blocks program, also known as DIBBS.

The goal of the program is to promote interdisciplinary collaboration and innovation in a wide variety of scientific fields.

NSF officials said education may finally be ready to catch up to data-related advances in other fields.

"We're now able to collect massive amounts of information on individual students we weren't able to collect 10 years ago," said John C. Cherniavsky, a senior advisor for research at the NSF. "It presents an opportunity in the education-research domain that has been available in the physical sciences for decades."

LearnSphere won funding, the NSF officials said, in large part because the effort will build on extensive work that researchers at Carnegie Mellon have already done.

Mr. Koedinger first won acclaim in the early 2000s for his involvement in the development of adaptive-learning software known as Cognitive Tutor, which uses big-data analysis to help spot the points at which students get stymied or disrupted in their mathematics-learning process, then provides targeted help to get them back on track.

In 2004, with NSF funding, Carnegie Mellon and the University of Pittsburgh jointly founded the Pittsburgh Science of Learning Center to study human learning and apply the findings to the development of teaching tools. The center in recent years created DataShop, an educational data repository that holds information from more than 550 datasets.

Cognitive Tutor software, used by about 600,000 middle and high school students in the United States, provides a large chunk of data to the repository, Mr. Koedinger said.

Additional information is generated by students' use of other interactive tutor systems and software programs, digital educational games and simulations, and massive open online courses, or MOOCs. Some of those tools come from universities, and some come from private companies and developers.

Detecting Emotional States

The data to be stored in the LearnSphere database and analyzed by researchers will be far-reaching, Mr. Koedinger said, likely including records of every mouse click a student makes when using a software program and information demonstrating a student's thought process when attempting to solve a problem in an online simulation.

The LearnSphere data will also likely include the text that college students type when participating in a discussion board for a MOOC, and that K-12 students enter when interacting with a dialogue-based adaptive-tutoring system.

And it may also include information on what Mr. Koedinger described as students' "affective emotional states," such as whether they are bored or frustrated, as gauged through either classroom observations or sensor technology that can detect an individual's posture, his or her skin's conductance of electricity, and more.

Part of what Mr. Koedinger hopes will make LearnSphere powerful is the ability to connect such varying data streams to each other in order to conduct large-scale analyses.

Already, he said, "we have shown some pretty interesting results in being able to detect different [emotional] states from keystroke data."

Such findings, Mr. Koedinger said, might be used to improve the ability of adaptive software to determine when a student is losing interest in a digital lesson, allowing the program to provide on-the-spot encouragement or remedial help.

Analysis of the types of data that LearnSphere proposes to store can also lead to surprising insights about how to best teach students, Mr. Koedinger said. He cited recent findings by his team at Carnegie Mellon that, contrary to conventional wisdom, showed students seem to learn algebra better when they are first introduced to problems in the form of a story, rather than in the form of an equation.

James Paul Gee, an education professor at Arizona State University, in Tempe, and an expert on the uses of data generated by digital games, said that "cherished theories" in other fields have already been upended by similar big-data analyses, especially those in which information is shared across institutions.

He pointed, for example, to medicine, where informational devices inserted into the body can now provide medical professionals with a constant, real-time stream of information on patients' biochemistry, allowing for a much richer and more accurate portrait of an individual's health than can be ascertained through check-ups or human monitoring.

Mr. Koedinger said the field of education is ripe for something similar.

"Our sense of learning from our conscious experience is just a slim sliver of what is actually happening in our brains," Mr. Koedinger said. "It's a lot more complex than we think it is."

Collection Concerns

But Mr. Gee is also among those worried that such large-scale educational data collection efforts will "create more noise, not more signal," effectively obscuring the very things researchers are hoping to learn.

That's especially true, he said, given the ways in which researchers and vendors are increasingly prioritizing digital over other types of data about how students learn, such as human observations of student-to-student interactions.

Such concerns about "over-collection" of digital learning data are also part of what troubles Ms. Barnes, the EPIC lawyer.

"We're increasingly operating outside the parameters of FERPA," she said, referring to the Family Educational Rights and Privacy Act, a 40-year-old federal statute that remains the primary law in place to protect students' privacy.

"We talk about modern privacy as being about an individual's right to control the information they've entrusted to others," Ms. Barnes said, "but it appears [with LearnSphere] that students will lose significant control."

Guarding Anonymity

Indeed, Mr. Koedinger said the new effort bears some similarity to the work attempted by Atlanta-based nonprofit inBloom, which closed its doors in April in the wake of stiff opposition from parents and advocates concerned with the privacy and security of children's sensitive information. Like inBloom, LearnSphere will involve the storage of massive amounts of student information and enable those data to be shared more easily.

But unlike the researchers' effort, which will facilitate data-sharing among researchers and some private companies and developers, inBloom aimed to sit between schools and vendors. The nonprofit also sought to collect and store personally identifiable student information directly from schools, which LearnSphere will not do.

For their part, both Mr. Koedinger and officials from the NSF acknowledged potential privacy concerns, but said protections, including, in some instances, approval by university institutional review boards, will be in place.

Mr. Koedinger conceded, however, that in many cases, the software and other digital learning tools that are feeding data to LearnSphere will be operating outside of formal research studies. In those cases, he said, the potential for "de-identified" or anonymous information to be shared with third parties may or may not be disclosed to schools and districts through a formal statement.

It's the latter point that most worries Leonie Haimson, the co-chairperson of the Parent Coalition for Student Privacy and a leading voice in the opposition that ultimately toppled inBloom.

"In general, we have nothing against research that is done with fully anonymized data," Ms. Haimson said. "But I think that any university involved in such a data [repository] has to make sure that the original collection of data was done ethically, with full consent and notification. They shouldn't leave it up to vendors."

Coverage of trends in K-12 innovation and efforts to put these new ideas and approaches into practice in schools, districts, and classrooms is supported in part by a grant from the Carnegie Corporation of New York. Education Week retains sole editorial control over the content of this coverage.
A version of this article appeared in the October 22, 2014 edition of Education Week as 'Big Data' Research Effort Faces Student-Privacy Questions
