Sunday, July 14, 2019

After InstructureCon: Yes, I'm still hoping for that data opt-out!

Last week, I did a round-up post focused on InstructureCon, summarizing my many concerns about Instructure's new AI experiments. Back in March, CEO Dan Goldsmith announced a big shift for Instructure: instead of just giving teachers and schools access to data for traditional statistics as in the past, Instructure itself would be analyzing our students, profiling them in order to create predictive algorithms for future business growth, while doubling their TAM as Goldsmith claimed:

InstructureCon updates on DIG

So, after InstructureCon we know a lot more about this AI project, called DIG. For example, Goldsmith now claims: We can predict, to a pretty high accuracy, what a likely outcome for a student in a course is, even before they set foot in the classroom. 

Personally, I find this claim hard to believe, given that the only data Instructure has to work with is the isolated, low-level data they gather from Canvas activity: log-ins, page views, quizzes, gradebook, etc. Unizin schools add demographics to that Canvas data (which I find even more alarming), but it sounds like Goldsmith is making the claim about Canvas data itself.

In any case, speaking for myself, I do not want Instructure to tell me how to do my job ("we can make recommendations..."), prejudicing my views of students before I have even met them. My school currently does not share a student's GPA with me, and for good reason; as I see it, Instructure's labeling of students in this way is no different than sharing their GPA. In fact, I would suspect that past grade data is a very significant component in Instructure's prediction engine, perhaps even the most significant component. But hey, it's their proprietary AI; I'm just guessing how it might work, which is all we can do with corporate AI/ML experiments.

Notice also the slipperiness of the word "outcome" in Goldsmith's claims about predictive accuracy. When teachers think about outcomes, we are thinking about what students learn, i.e. the learning they can take away with them from the class (what comes out of the class), especially the learning that will be useful to them in their later lives. And that's very complex; there is a whole range of things that each student might learn, directly and indirectly, from a class, and at the time of the class there's no telling what direction their lives might take afterwards and what might turn out to be useful learning along that life path. But the LMS has no record of those real learning outcomes. In fact, the LMS has no real measures of learning at all; there are only measures of performance: performance on a quiz, performance on a test, attendance, etc. So when Goldsmith talks about predicting the "likely outcome" for a student, what I suspect he means is that Instructure is able to predict the likely final grade that the student will receive at the end of a class (which is why I suspect GPA would be a big component in that prediction). But the grade is not the learning, and it is not the only outcome of a class. In fact, I would argue that we should not be using grades at all, but that is a topic for a separate discussion.

What about a data opt-out?

So, now that we know more about the goals of DIG, what about opting out? There was no announcement about an opt-out, and no mention even of the possibility of an opt-out. Goldsmith even claimed in an interview that there hasn't been any request for an opt-out: "We haven’t had that request, honestly." 

Well, that claim doesn't make sense as I myself had a long phone conversation with two VPs at Instructure about my opt-out request. What Goldsmith must mean, I suppose, is that they have not had a request at the institutional level for any campus-wide opt-outs, which is not surprising at all. While it would be great if we had some institutional support for our preferences as individual users, I would be very surprised if whole institutions decide to opt out. Predictive analytics serve the needs of institutions far more than they do the needs of individual teachers or students, and I can imagine that institutions might be eager to see how they can use predictive analytics to look at school-wide patterns that are otherwise hard to discern. Teachers can grok what is going on in their individual classrooms far more easily than provosts and deans can grok what is going on across hundreds and thousands of classrooms. 

Yet... there is hope!

Yet I still have some hope for an opt-out, because I learned from that same Goldsmith interview that individuals OWN their data: One of our first and primary tenets is that the student, the individual and the institution own the data—that’s their asset. 

And he says the same in this video interview here: we own our data.

This concession about data ownership really caught me by surprise, in a good way, and renewed my hope for an opt-out. If individuals own their data, then we should be able to take our data out of the Instructure cloud when a course is over if we choose to do so. In other words: a data opt-out, perhaps with the same procedure that Instructure already uses to sunset data from schools that terminate their Instructure contract.

In fact, in the context of ownership, it really sounds more like an opt-in is required. If Instructure wants to use my data (data about me, my behavior, my work, my OWN data), then they should ask me for my permission. They should ask for permission regarding specific timeframes (a year, or two years, or in perpetuity, etc.), and they should ask for permission regarding specific uses. For example, while I strongly object to AI/ML experiments, there might be other research to which I would not object, such as a study of the impact that OER has on student course completion. Not all data uses are the same, so different permissions would be required.

Of course, as I've said before, I am not optimistic that Instructure is going to implement an opt-in procedure — even though they should — but I am also not giving up hope for a data opt-out, especially given the newly announced Canvas data tenets.

Canvas Data Tenets

In addition to this surprising concession about data ownership, we learned about these new Canvas data tenets at InstructureCon. In the video interview cited above, Goldsmith promised a post about data tenets coming soon at the Instructure blog, and there was already this slide in circulation at InstructureCon, which I assume shows the data tenets Goldsmith is referring to in the interview (strangely, even the Instructure staff keynotes were not livestreamed this year, so I am just relying on Twitter for this information). As you can see, one of those tenets is: Empower People, don't Define Them.

Now, the language here sounds more like marcomm-speak rather than the legal or technical language I would expect, but even so, I am going to take heart from this statement. If Instructure promises to empower me, then surely they will provide a data opt-out, right? It would not be empowering if Instructure were to take my Canvas data and use it for an experiment to which I do not consent, as is currently the case.

My Canvas Data Doubts

Meanwhile, that tension between empowering people and defining them is what I want to focus on in the final part of this blog post. I saw really mixed messages from InstructureCon this year, as the big keynotes from Malcolm Gladwell, Dan Heath, and Bettina Love were all about community, peak moments, love and creativity... with a corporate counterpoint of big data and a billion Canvas quizzes as I learned via Twitter:

See also the contradiction between Goldsmith's claim in an interview that Instructure is all about "understanding the individuals, their paths, their passions, and what their interests are" and what we see in the data dashboards: there are no passions and interests on those dashboards (but I do know those red "missing" labels all too well):

Impersonal personalization

There's a single word that I think expresses this dangerous ambivalence in ed-tech generally, and at Instructure in particular; that word is personalization. On the one hand, personalization looks like it would be about persons (personal agency, personal interactions, personal passions) but personalization has also become a codeword for the automation of education. Both in terms of philosophy and pedagogy, automation sounds really bad... but personalization: ah, that sounds better, doesn't it?

So, for example, listen to what Dan Goldsmith says in this interview; it's technology inevitablism, literally (video here): So when you think about adaptive and personalized learning I think it's inevitable that we as an educational community need to figure out ways of driving more personalized learning and personalized growth experiences.

I'm not going to rehash here all the problems with the rhetoric of personalization; Audrey Watters has done that for us, as in this keynote (among others): Pigeons and Personalization: The Histories of Personalized Learning. (A good all-purpose rule for thinking about ed tech: READ AUDREY.)

Instead, I will just focus here on the impersonality of Canvas data, listing five big reasons why I mistrust that data and Instructure's claims about it:

1. Canvas data measure behavior, not learning. Canvas is an environment that monitors student behavior: log on, log off; click here, click there; take this quiz, take that quiz; write this many words, download this many files, etc. If your educational philosophy is based on behaviorism, then you might find that data useful (but not necessarily; see next item in this list). If, however, your educational philosophy is instead founded on other principles, then this behavioral data is not going to be very useful. And consider the keynote speakers at InstructureCon: none of them was advocating behaviorism; just the opposite. Here's Bettina Love, for example, on liberation, not behaviorism (more on her great work below):

2. Canvas fails to gather data about the why. Even for purposes of behavior modification, that superficial Canvas data will not be enough; you need to know the "why" behind that behavior. If a student doesn't log on to Canvas for a week, you need to know why. If a student clicks on a page but spends very little time there, you need to know why. If a student does poorly on a quiz, you need to know why. For example, if a student got a poor score on a quiz because of a lack of sleep that is very different from getting a poor score because they did not understand the content, which is in turn very different from being bored, or being distracted by problems at home, etc. Just because students completed a billion quizzes in Canvas does not mean Instructure has all the data it needs for accurately profiling those students, much less for making predictions about them.

3. Canvas data are not human presence. The keynote speakers consistently emphasized the importance of people, presence, relationships, and community in learning, but numbers are not presence. Does this look like a person to you? This is how Canvas represents a student to me right now; the coming data dashboard (see above) uses the same numbers repackaged, because that is all that Canvas has to offer me: numbers turned into different kinds of visualizations.

Goldsmith claims that Instructure is different from other learning companies because they are all about people's passions and interests, but that claim does not fit with the views I get of my students in the Canvas Dashboard and the Canvas Gradebook: no passions, no interests; just numbers. I don't need percentage grades, much less the faux-precision of two decimal points. Instead, I need to know about students' passions and interests; that would be exactly the information that will help me do my job well, but Canvas data cannot provide that data.

4. Canvas data does not reflect student agency. The basic pedagogical design of Canvas is top-down and teacher-directed. Student choice is not a driving principle; in fact, it is really a struggle to build courses based on student choice (I will spare you the gory detail of my own struggles in that regard). Students cannot even ask questions in the form of search; yes, that's right: students cannot search the course content. The only access to the course content is through the click-here-click-there navigation paths pre-determined by the instructor. And, sad to say, there is apparently no fix in sight for this lack of search; as far as I could determine, there was no announcement regarding the deferred search project from Project Khaki back in 2017 (details here). Think about that lack of search for just a minute. It's no accident that Google started out as a search engine; the questions that people brought to Google, and people's choices in response to those answers, generated the behavioral surplus juggernaut that now powers Google AI. Netflix succeeds as a prediction engine precisely because it is driven by user choice: lots of options, lots of choices, and lots of data about those choices with which to build the prediction engine. The way that Canvas forestalls student choice, including the simple ability to initiate a search, is why I believe their AI project is going to fail. (Meanwhile, if I am wrong and there was an announcement about Canvas search at InstructureCon, let me know!)

And this last item is actually the most important:

5. Canvas data cannot measure obstacles to student learning. By focusing data collection on the students, Instructure runs the risk of neglecting the social, political, and economic contexts in which student learning takes place. Whether students succeed or fail in school is not simply the result of their own efforts; instead, there are opportunities and obstacles, not evenly distributed, which are crucially important. Does Canvas data record when students are hungry or homeless or without health insurance? Does Canvas data record that a course is taught by a poorly paid adjunct with no job security? As Dave Paunesku wrote in Ed Week this week, "When data reveal students' shortcomings without revealing the shortcomings of the systems intended to serve them, it becomes easier to treat students as deficient and harder to recognize how those systems must be changed to create more equitable opportunities." I hope everybody will take a few minutes to read the whole article: The Deficit Lens of the 'Achievement Gap' Needs to Be Flipped. Here's How. (Short answer: another billion quizzes is not how you flip the deficit lens.)

Of course, this is all a topic for a book, not a blog post, so I'll stop for now... but I'll be back next week to start a new approach to these datamongering round-ups: a commentary on Shoshana Zuboff's Surveillance Capitalism. Of all the concepts in play here, the one that is most important to me is what Zuboff calls our "right to the future tense." So, I will work through her book chapter by chapter in the coming weeks, and hopefully that will make it more clear just why it is that I object so strongly to Instructure's predictive analytics experiment.

~ ~ ~

I want to close here with Bettina Love's TED talk; take a look/listen and see what you think: I think she is brilliant! More also at her website.

Speaking for myself, I'll take dance and lyrics over data analytics any day. So, keep on dancing, people! And I'll be back next week with Shoshana Zuboff's book and our right to the future tense. :-)

Sunday, July 7, 2019

Data Mongering (12): Special InstructureCon Edition

I began this #datamongering project back in March when I first learned of Instructure's plans to exploit existing user data to create predictive algorithms built with AI and machine learning: Soylent Canvas. I am still hoping that those of us who are opposed to the use of predictive algorithms in education will be able to OPT OUT so that Instructure will not be able to use our data to develop its algorithms and train its AI system (even better would be opt-in, but I don't actually have any hope for that one).

I submitted a question about data opt-out to the InstructureCon Engineering panel (my question). Coincidentally (?), Instructure then published a blog post about its privacy policy, so I reiterated that my question is not about privacy; it is about opting out of Instructure's plans to mine my data, all our data, for machine learning (my follow-up):

In talking to people about this, I've found that many educators are still not really sure just what AI and predictive algorithms mean for education, how LMS companies do data mining, what the difference is between machine learning and traditional statistical analysis, etc. etc. Over the past five months, I've been collecting online materials on these topics, so in this special "InstructureCon Edition" of my #datamongering round-ups, I've listed what I see as some of the most valuable resources people can use to learn more. Read on:

1. Instructure: Plans to expand beyond Canvas LMS into machine learning and AI by Phil Hill. This blog post is where I first learned about the big shift at Instructure, and you will find extensive quotes from Instructure's new CEO, Dan Goldsmith. This is a must-read for anyone whose school is using Canvas LMS:

For more on ed tech companies and their data, see also: EdTech Companies With The Most Student Data by Justin Menard.

2. Despite Dan Goldsmith's claims about Instructure's database, there is nowhere near enough data in Canvas to model real learning by real students. What kind of surveillance will be required to get the actual data required? China has a Next Generation Artificial Intelligence Development Plan (NGAIDP) that is bringing full-scale student surveillance to the classroom; there is detailed reporting here from Xue Yujie: Camera Above the Classroom. If you are going to read just one article on AI in education, this is the one to read.

3. For a student perspective, you can listen to the story of Bryan Short, a student at the University of British Columbia in Canada. There is an interview with Bryan at EdSurge: Inside a Student’s Hunt for His Own Learning Data (podcast with transcript), plus an article from the UBC student newspaper that puts Bryan's story in context: Canvas is tracking your data. What is UBC doing with it? by Zak Vescera.

As you can see, if the LMS does not give students the opportunity to opt out, things get very complicated as Bryan learned when he opted out on his own. This is why Instructure needs to give individuals more control over who is allowed to use their data and for what purposes.

4. Increasing surveillance of students is an issue of great concern for both higher ed and for K-12. On K-12, see this important piece by Benjamin Herold in EdWeek: Schools Are Deploying Massive Digital Surveillance Systems. The Results Are Alarming.

5. For resisting surveillance, and LMS surveillance in particular, you will find a good discussion here: Ethics and LMS Surveillance which is part of #DHSI19: Balancing Issues of Critical Digital Pedagogy containing contributions from Chris Friend and many others.

And for more, see also Erin Glass writing at HASTAC: Ten weird tricks for resisting surveillance capitalism in and through the classroom.

6. By mining student work to create new products, Instructure is following the lead of TurnItIn, a company which recently sold for $1.75 billion (not a typo). For an overview, see Automating Mistrust by Ben Williamson.

Also, this piece on TurnItIn from two years ago is still as relevant as ever: A Guide for Resisting Edtech: the Case against Turnitin by Sean Michael Morris and Jesse Stommel.

7. Did you notice that Canvas rebranded itself in June as a platform, not just an LMS? (details at the official Canvas blog). For an idea of just what the platforming of education means, here's a great piece, also from Ben Williamson: The platform university: a new data-driven business model for profiting from HE.

And for more on education-as-platform, see also Platform Capitalism and the Governance of Knowledge Infrastructure by Leslie Chan.

8. Matt Crosslin is more optimistic than I am that there is real value in data analytics, and he also recognizes some real pitfalls too; this blog post provides a great overview: So What Do You Want From Learning Analytics?

And for some perspective over time, see Lawrie Phipps's disenchanted take on algorithms: Postdigital: 10 years later, Algorithms and Agency.

9. Anyone going forward with algorithms needs to be aware of the dangers involved, and there are indeed many dangers. This resource from MIT points out some of them: AI Blindspot: A discovery process for spotting unconscious biases and structural inequalities in AI systems.

And here's another good read from a teacher's perspective: 10 Ways Data Can Sabotage Your Teaching by Terry Heick.

10. Finally I want to close with a brilliant film from sava saheli singh's project Screening Surveillance. The film is not about education, but it's easy to see just how this model employee could be re-imagined as a model student. Leila Khalilzadeh is the director, with a screenplay by Tim Maughan: Model Employee.

So, keep on reading, people! We cannot afford to be ignorant about AI, because . . . The AI Supply Chain Runs on Ignorance.

And if anybody is at the Instructure Engineering panel at InstructureCon on Thursday (July 11) 4:20PM in the Long Beach Convention Center, GB-B, please let me know if they say anything about a data opt-out. I don't know if my question will make the cut or not... but I have not given up hope yet.

Sunday, June 30, 2019

Data Mongering (11): A TurnItIn-Amazon-Gates Trifecta

This is my eleventh round-up; you can see them all here: Data-Mongering Round-Ups. As always, no shortage of items to report on!

I want to start with an announcement for Canvas users: at InstructureCon, there will be an AMA-style panel with engineering leadership from Instructure, and you can submit questions in advance here. I submitted a question about data-mining, and also one about search (yep, they mine our data but we cannot search our own course content; details). So, chime in at the Canvas Community in advance and, if you'll be at InstCon, the panel itself is Thursday, Jul 11 at 4:20-5:00 PM.

And now, this week in datamongering:

An important new blog post from Ben Williamson on TurnItIn: Automating mistrust. I see TurnItIn as being the ominous harbinger of an approach we now see spreading throughout the related world of the LMS, so this is an important read for all educators, not just those of us who teach writing. quote "Turnitin is also reshaping relationships between universities and students. Students are treated by default as potential essay cheats by its plagiarism detection algorithm. [...] Turnitin’s continued profitability depends on manufacturing and maintaining mistrust between students and academic staff, while also foregrounding its automated algorithm over teachers’ professional expertise." Ben's post contains lots of links in turn to pieces by Jesse Stommel, John Warner, Lucas Introna, and others and he also discusses an aspect of TurnItIn operations that I find especially troubling: the WriteCheck service which allows students to TurnItIn-proof their work before they submit it, for a steep fee of course. The student who first alerted me to the existence of WriteCheck dubbed it "Write-Me-A-Check" ($8 per paper, discounts for repeat users).

Plus more about TurnItIn in the news this week at CampusTechnology: Turnitin Partnership Adds Plagiarism Checking to College Admissions. In response to that, an excellent comment from Susan Blum:

Susan would know; she is the author of My Word!: Plagiarism and College Culture (the Kindle is just $7.99, people!). Table of contents: 1 A Question of Judgment / 2 Intertextuality, Authorship, and Plagiarism / 3 Observing the Performance Self / 4 Growing Up in the College Bubble / 5 No Magic Bullet.

Meanwhile, this piece from Anya Kamenetz at NPR has a theme that is really relevant to the question of (mis)trust: instead of monitoring, we need to be mentoring! At Your Wits' End With A Screen-Obsessed Kid? Read This. quote "Heitner advises that families like this one need to switch from monitoring to mentoring. Policing their kids' device use isn't working. They need to understand why their kids are using devices and what their kids get out of those devices so they can help the kids shift their habits." (Devorah Heitner is the author of Screenwise: Helping Kids Thrive (and Survive) in Their Digital World.) This same advice applies IMO to teachers: if students are not writing well, policing with TurnItIn is not going to give us the information we need to do better. Instead, we need to understand why students write well, or not, and what we can do to create more meaningful writing/learning experiences.

And now, moving on from TurnItIn this week to... Amazon. There is a great piece by Will Oremus at OneZero: Amazon Is Watching. quote "Imagine Ring surveillance cameras on cars and delivery drones, Ring baby monitors in nurseries, and Amazon Echo devices everywhere from schools to hotels to hospitals. Now imagine that all these Alexa-powered speakers and displays can recognize your voice and analyze your speech patterns to tell when you’re angry, sick, or considering a purchase. A 2015 patent filing reported last week by the Telegraph described a system that Amazon called “surveillance as a service,” which seems like an apt term for many of the products it’s already selling." 

Amazon has yet to make its big play for education; will it be Alexa in schools everywhere...? (More on EchoDot for kids, plus a lawsuit on Amazon child surveillance.) And don't forget the drones: With Amazon’s New Drone Patent, The Company’s Relationship With Surveillance Is About To Get Even More Complicated.

And on Amazon Rekognition, see this important piece: Amazon's Facial Analysis Program Is Building A Dystopic Future For Trans And Nonbinary People by Anna Merlan and Dhruv Mehrotra at Jezebel. This is a long and detailed article, with both big-picture information and also results of a specific Rekognition experiment. quote "Rekognition misgendered 100% of explicitly nonbinary individuals in the Broadly dataset. This isn’t because of bad training data or a technical oversight, but a failure in engineering vocabulary to address the population. That their software isn’t built with the capacity or vocabulary to treat gender as anything but binary suggests that Amazon’s engineers, for whatever reason, failed to see an entire population of humans as worthy of recognition."

And to complete the trifecta this week, here's more on Bill Gates's ambitions for higher ed via John Warner at IHE: Bill Gates, Please Stay Away from Higher Education. quote "These large, seemingly philanthropic efforts undertaken by billionaires like Gates are rooted in a desire to preserve the status quo where they sit atop the social order. Rather than putting his money into the hands of education experts or directly funding schools or students, he engineers programs, which replicate his values."

And for a related fail in education this week: AltSchool’s out: Zuckerberg-backed startup that tried to rethink education calls it quits. quote "AltSchool wooed parents and tech investors with a vision of bringing the classroom into the digital age. Engineers and designers on staff developed software for assisting teachers, and put it to work at a group of small schools in the Bay Area and New York run by the startup. At those outposts, kids weren’t just students; they served as software testers, helping AltSchool refine its technology for sale to other schools." Specifically on the subject of students as software testers, see these concerns expressed much earlier about exploiting students as data sources from Connie Loizos at TechCrunch: AltSchool wants to change how kids learn, but fears have surfaced that it’s failing students. quote "Compounding their anger these days is AltSchool’s more recent revelation that its existing network of schools, which had grown to seven locations, is now being pared back to just four — two in California and two in New York. The move has left parents to wonder: did AltSchool entice families into its program merely to extract data from their children, then toss them aside?"

And yes, there are more items that I bookmarked... but surely that's enough for this week. Eeek. 

On an up side, thanks to Tom Woodward I learned about this data-mongering resistance tool: it opens 100 tabs in your browser designed to distort your profile. I'm not sure I want non-stop streetwear ads... but it would definitely skew my profile which currently delivers an endless stream of ads for books (no surprise) and for, yep, CanvasLMS, ha ha, as if I am in the market for an LMS. More at

And the graphic this week also comes from Tom at Twitter:

Plus XKCD on predictive modeling........

No docks at midnight... but I'll see you here again next week. And if you have #datamongering items to share at Twitter, use the hashtag and we can connect.