Sunday, April 21, 2019

Data-Mongering (2): Algorithms and Agency, GIGO, and more

This is my second of these round-ups; you can see them all here: Data-Mongering Round-Ups. I'm also annotating as I go in Diigo, so you can see the Diigo files here: Data-Mongering articles. The editorial comments there are just copied-and-pasted from the blog posts.

It's depressing to keep on reading and learning about this, but especially now that I'm reading Shoshana Zuboff's The Age of Surveillance Capitalism, I can see that the shift is happening at Instructure just as it has in one company after another: under their new CEO, Instructure has realized that its "collateral" data can actually be commodified, turning user behavior into the product that is being sold (Soylent Canvas). Who would have thought that the main outcome of the digital revolution in education would be the triumphant return of behaviorism...? Eegad. Skinner would be so happy. And I am not.

Anyway, here are some of the things I read this week that made me stop and think:

Postdigital: 10 years later, Algorithms and Agency by Lawrie Phipps. This piece gets at so many of my deep concerns right now, and its "looking back" perspective to 10 years ago shows what a dramatic shift there has been in the normalizing of expansive digital networks, both good and bad. For example, re: TurnItIn and their ilk: "Would the sector have been so fast to sign up to a plagiarism service 10 years ago, if they had known all the student IP would one day be the property of a publishing company?" I too was wildly naive 10 years ago, and guilty of some techno-evangelism I guess. I still love teaching online, but more and more I see the technological space in which I am supposed to do my work (Canvas) as being a threat, not a resource. Quote: "The naive utopia we described in our 2009 postdigital paper probably only exists in the minds of idealists and tech evangelists. People have designed digital tools, platforms, and other environments with political and financial motives. In our current postdigital world, digital does not serve the social, but through the manipulation of people, it is driving a particular kind of society, one that exploits the weaknesses and fears of people; enables the rise of racism and xenophobia, and intensifying inequality."

Why ‘learning analytics’? Why ‘Learning Record Stores’? by Donald Clark. I don't always agree with Donald Clark, but I think he is spot on in his criticism of learning analytics hype here: "Perhaps the best use of data is dynamically, to create courses, provide feedback, adapt learning, text to speech for podcasts and so on. This is using AI in a precise fashion to solve specific learning problems. The least efficient use of data is storing it in huge pots, boiling it up and hoping that something, as yet undefined, emerges" (that last bit sounds just like the pie-in-the-sky claims by Instructure's CEO that just because they have lots of data they can get lots of use out of it). Specifically on AI and learning behavior: "Recording what people just ‘do’ is not that revealing if they are clickthrough courses, without much cognitive effort. Just showing them video, animation, text and graphics, no matter how dazzling is almost irrelevant if they have learnt little. This is a classic GIGO problem (Garbage In, Garbage Out)."

One Way to Reduce Gender Bias in Performance Reviews by Lauren Rivera and András Tilcsik. This is a fascinating piece at Harvard Business Review that warns us to be suspicious of any measurement because the measuring stick itself shapes the data in ways that we never realized or intended, like the way that women are more discriminated against if you use a 10-point rating scale as opposed to a 6-point scale. So, before we start measuring everything, we need to stand back and think about the prejudices that are going to inform/deform every supposedly objective measurement we make.

Institutions’ Use of Data and Analytics for Student Success by Amelia Parnell, Darlena Jones, Alexis Wesaw, and D. Christopher Brooks. This is part of an Educause research project, and it's a good reference point for the ways that schools are trying to use data to improve student success. It is such a slippery slope, and insofar as these systems rely on numbers, and grades in particular, I am dubious. My main concern, though, is the fact that in their eagerness to run their own data experiments, schools have given companies like Instructure way too much freedom to commodify and monetize that student data for purposes that go far beyond any local initiative. From this report, I learned that my school is not alone in a strong focus on first-year retention, and the report also shows that efforts are instead going to advising, tutoring, and counseling, which is again the case at my school. IMO we need to focus on the direct educational mission, not just on ancillary supports. Sadly, the report does not recommend strengthening faculty role or involvement; instead, the recommendation is to "identify and expand institutionally appropriate roles for IR, IT, and student affairs." But there was also this bizarre quote plunked down in the middle of the discussion of admin and support services: "As algorithms become more sophisticated, there will increasingly be opportunities for faculty to become more engaged in the delivery of interventions." The one bright spot was this recommendation: Recommendation 4: Increase the use of qualitative data, especially from students. Yes, I say, yes! Student voices please!

Developing Minds in the Digital Age: Towards a Science of Learning for 21st Century Education. Big book (250 pages) from Patricia Kuhl et al. at OECD, which I learned about from Ben Williamson at Twitter. I haven't read the book yet; I was really struck, though, by the capitalization of Big Data and Artificial Intelligence as if they were gods or something; what's up with that??? This image is from Ben Williamson's tweet:

Anyway, the book looks useful, and I will give it a read this summer. 

Insurers Want to Know How Many Steps You Took Today by Sarah Jeong. I already knew a lot of the content covered here in the NYTimes article from reading Cathy O'Neil's Weapons of Math Destruction, and the whole "health management" business shows just what dangers await us in the "learning management" business: "As machine learning works its way into more and more decisions about who gets coverage and what it costs, discrimination becomes harder to spot." College pricing is already a nightmare (sticker price, as it were, versus what individual students end up actually paying). That's just one nightmare scenario I can see playing out in future, as colleges become increasingly convinced, rightly or wrongly, that they can predict accurately just how students are going to perform (and creating self-fulfilling prophecies as a result of the biases that they institutionalize in this way...).

And from @YujieXuett, a screenshot (which of course I cannot read) of data-mongering in Chinese schools:

Sunday, April 14, 2019

#TotalCoLearner: a great semester of Hanuman learning

I wrote my first data-mongering round-up yesterday and did a lot of reading on predictive analytics in education... and ugh, it's all worse than I expected. But learning is good, and the better informed I become, the more useful I can be in voicing opposition to this dehumanized education.

Meanwhile, before the weekend runs away, I wanted to write up something about the #TotalCoLearner experiment this semester in Indian Epics, because it has gone GREAT. I've got a series of #TotalCoLearner posts over at my Canvas blog, plus tweets, but this is my first post here about #TotalCoLearner, and it's perfect timing since I just wrote my last story of the semester yesterday, and I'll be wrapping up the class soon, finishing up early as some of my students do too.

So, what is #TotalCoLearner? The idea is that I do the whole course just like a regular student! That means you can see my course blog, just as the students each have their own blog, and I also have a course project website, just as the students do. One of the best things about all of this is that students comment on my blog and on my website at random just like they comment on the work of other students at random. And yes, they are surprised to find out that I am a student in the class, and it's kind of a weird surprise, but a good one; you can see their comments on my Introduction post this semester here. You can also see their comments on my project at the Comment Wall.

I keep track of my progress as the students do, although I use a spreadsheet instead of the Canvas Gradebook... and, honestly, I feel bad about how clunky and primitive the Canvas Gradebook is; the spreadsheet I use is much easier to configure based on the different ways I want to check my progress (by date, by type of assignment, by my plan for finishing up, etc.). For example, you can see here that as one of the few assignments I have left to do, there's one I should do today, which is writing up a famous last words post. I'll do that after I finish this post.

The nature of the course design means I really can do everything exactly as the students do; I don't have to "pretend" anything... I can just be myself. Admittedly, I'm not a typical student, but the whole point of my course design is that there is no "typical" student. Instead, every student shows up here with their own background and interests, their own skills and gaps, their own goals and priorities. Based on all that, each person is choosing what they want to read and write and other work that they do for the class, week by week, sharing their work via their blog and their website. It's because I can choose that I am able to adapt the class to suit my learning needs and goals, and the students are also doing the same thing for their own needs and goals.

The only difference between my work for the class and what the students are doing is that I can't do the weekly "project feedback" assignments because I do feedback already on all the projects every week as part of my job as a teacher. So, no worries: I just replace that assignment with other optional assignments, mixing-and-matching from the available assignments just like the students also do based on what assignments they choose to do (or not). This semester I did extra credit reading posts because I was reading a ton of stuff.

And as a result, oh my gosh, I LEARNED SO MUCH. And that's because I set myself a really cool and new challenge: I immersed myself in the version of the Ramayana from Thailand known as the Ramakien, and I also gave myself a crash course in the arts of Thailand that are inspired by Rama's story (Khon theater masks, temple sculptures, so much beautiful stuff). I had always known about the Ramakien's existence, and I knew about Suvannamaccha, Hanuman's mermaid lover... but that's all. I had never read the whole thing. So, this semester, I read the whole thing! (In a terrible translation, but alas, there is no good English translation of the Ramakien.)

So, just like the students, I posted my reading notes week by week on the Ramakien (and I also re-read Chitra Divakaruni's Mahabharata novel, Palace of Illusions, plus I read Samhita Arni's new novel based on the Silappadikaram, a south Indian epic; more about that here). For Tech Tip extra credit, I built some randomizing image widgets with Hanuman art from India and from Southeast Asia, and I even learned how to embed image randomizers into a Google Sites page! On the writing side, I also pushed myself in new ways so that the project I ended up writing was actually not an anthology (my usual writing approach), but a true extended narrative that ended up wrapping around at the end back to where I started, with a Ramakien-inspired story about Hanuman's mother to which I added my own big reveal at the end; that's the story I wrote yesterday: Hanuman and Pirakuan. Figuring out how that story would work may be the biggest writing thrill I've ever had... I'm so proud of figuring out how to bring Hanuman's mother back around into the story there at the end. (And, yes, that means my project ended up being centered on women's stories, which is a theme that comes up again and again in student projects in the class too, finding ways to decenter the men's stories so that we can bring more women's voices into the epic world.) I also wrote some stories here at the blog separate from the project that I was really proud of, especially my story about Mandodari, plus one about Arjuna and Hanuman.

So, for finishing up the class, I need to add that final Hanuman-and-Pirakuan story to my Storybook website, as well as adding another image gallery page to the site, along with some wrap-up posts at the blog, which I'll probably do next weekend. And while I will be sad to end this particular learning adventure (there is so much more I still want to learn about the Southeast Asian Ramayana!), I am also so excited about what I will do for the Myth-Folklore class next semester. I've got a huge (HUGE) Brer Rabbit project that I began over winter break, and that is what I am going to use to ignite my participation in the Myth-Folklore class in the Fall.

And yay Brer Rabbit too! :-)

Saturday, April 13, 2019

Data-Mongering: Platform U and Other News of the Week

It's Saturday and part of me wants to write a happy blog post about open-ended pedagogy and colearning... and maybe I'll do that later (see #TotalCoLearner at Twitter), but I think I need to review the data-mongering articles I read this week. In fact, I might try to make this a weekly round-up of sorts, scrolling back through my Twitter feed and sharing links here. These are articles that I read this week; some are new, but some are old which I only now got around to reading.

And what is "Platform U" you might ask? Read on:

The platform university: a new data-driven business model for profiting from HE by Ben Williamson. This article discusses exactly what I see happening at Instructure, and why I am so unhappy about it: 
Despite studies repeatedly showing Turnitin’s high error rate, and considerable concern over the mistrust it creates while monetising students’ intellectual property, its acquisition clearly demonstrates huge market demand for data-driven HE platforms. It changes how students are valued—not for their independent intellectual development but as raw material to be mined for market advantage.

How Ed Tech Is Exploiting Students [premium at Chronicle of Higher Ed] by Chris Gilliard. This is an article from last year warning about the dangers discussed in Williamson's article, focusing specifically on the students' lack of consent in the exploitation of their data:
When we draft students into education technologies and enlist their labor without their consent or even their ability to choose, we enact a pedagogy of extraction and exploitation. It’s time to stop.

Colleges Are Banding Together Digitally to Help Students Succeed. Here’s How [premium at Chronicle of Higher Ed] by Alexander C. Kafka. A truly horrifying piece about Canvas data mining, this time in the context of the Unizin consortium (my school does not belong). This is exactly what Goldsmith at Instructure promised (Soylent Canvas), and now with endorsement from the educational administrators themselves: they really believe in this data nightmare. Sad to see Jared Stein quoted here; I guess the whole Instructure crew really is on board with the new predictive-analytics push where students are reduced to their clickstreams and pageviews (my thoughts on AI Overreach: students are more than the data they leave behind in an LMS!).
Take students’ clickstreams and pageviews on the learning-management system, their writing habits, their participatory clicks during classroom discussions, their grades. Then combine that with information on their educational and socioeconomic backgrounds, their status as transfer students, and so on. You end up with "a unique asset," says Wheeler, in learning what teaching methods work.

Counting the Countless: Why data science is a profound threat for queer people by Os Keyes. The observations here about the state apply to educational institutions also, and it is surely the most vulnerable students who are going to be hurt most by tracking based on predictive analytics driven by LMS data-mining:
So: trans existences are built around fluidity, contextuality, and autonomy, and administrative systems are fundamentally opposed to that. Attempts to negotiate and compromise with those systems (and the state that oversees them) tend to just legitimize the state, while leaving the most vulnerable among us out in the cold. This is important to keep in mind as we veer toward data science, because in many respects data science can be seen as an extension of those administrative logics: It’s gussied-up statistics, after all — the “science of the state.”

Margin of error in data-driven decisions by Robin De Rosa. This is a great piece on the gap between quantitative and qualitative data, especially in education... data has to be more than number-crunching!
When we ask whether there is evidence for something related to learning, we are presuming that we all agree 1) what learning is and 2) what constitutes evidence. I contend that “learning” is broader and messier than what we generally assess, and also that “evidence” has been reductively equated with quantification and with the assumption that environments in education are controlled. At the core, I think the biggest problem is that we forget that humans aren’t just giant brains walking around: we are also a jumble of social contexts, emotions, and circumstances.

10 technologies that will impact higher education the most this year by Macy Bayern. Yep, predictive analytics, AI, nudges, it's all there. Instead of a pull-quote, I will share Robin's tweet. What she said.

And this cartoon that Bob Calder shared with me is a great way to express how all this top-down data-mongering looks very different from the point of view of teachers and students who are being surveilled. The cartoon circulates in lots of languages, but I think it may have started with the Polish version (?):

He likes it!
Have fun playing, little one.

Sunday, April 7, 2019

Curating a Public Domain of Folklore and Mythology

I've been blogging about Canvas here over the past few weeks, but there are other/better/happier things to blog about, especially now that SUMMER is coming (the semester is over on May 4 for me!)... and summer means PROJECTS. More specifically it means PUBLIC DOMAIN projects, so I am going to write up a post here with some thoughts about what the public domain means for me in general, and more specifically what I hope to be doing this summer.

The Freebookapalooza

The public domain of printed books is, for me, the most important resource for teaching my classes, and it is also where I want to focus my efforts when I finally retire from my job (or get laid off... whichever comes first, ha ha). Thanks to the amazing resources at Hathi Trust, Internet Archive, Project Gutenberg, and Sacred Texts Archive, along with other online book projects, there is a wealth of material in the world of folklore and mythology that is available in the form of full-text books online. Mythology and folklore is a field that really lends itself to these public domain resources, and the main way in which I have been curating those public domain resources for folklore and mythology is at my Freebookapalooza.

The Freebookapalooza is just a simple blog where each post at the blog is about a book online, most of which (but not all) are public domain books. For each post, I include basic information about the book along with link(s) to the book online. I also include a table of contents so that, in addition to the book title, there are also the titles of chapters/stories in the book. Most of the books I am collecting are story collections, and having those titles can be really helpful in deciding just how useful the book might be for a specific purpose. I also include an image of some kind: the book cover, an illustration from the book, or some other image that is relevant to the book's contents. I use those images in the randomizers, like in the sidebar of this blog for example.

Right now, I have 1222 books posted at the blog. My goal, in honor of the year 2019, is to get to 2019 books by the end of the year, and I've got a little reminder script to keep me on track towards that goal. Right now I'm just a little bit ahead of schedule, but not by much, as you can see in this screenshot:

The Public Domain of 2019

The reason I wanted to expand the blog this year was that 2019 was a turning point for the public domain: this year books that were published with a 1923 copyright entered the public domain! Back in 1998 Congress extended the 75-year copyright term by another 20 years, which meant that there has been a long freeze on books entering the public domain. But now we are back on track, so that books from 1923 entered the public domain this year, and next year it will be books from 1924, and so on. Of course, there are books published in 1924 and later which are not restricted by copyright, some of which are even in the public domain, such as books published with a Creative-Commons-0 license, "no rights reserved." There are also other Creative Commons licenses, plus books which publishers put online as a public service while retaining the copyright; it's a whole big beautiful world of digital books out there! So, I include a whole range of full-text books online at the Freebookapalooza, but I focus my efforts on the public domain books which can be shared and reused freely, without any limitations.

Curation Strategies

There are lots of ways to think of the curation process, and I see my work in terms of these general tasks:

Description/Annotation. A book title provides a tiny bit of description, but readers need more. In addition to just knowing about the contents of the book, readers also need a heads up about the limitations of the book, especially when it comes to pre-1924 books in the public domain where there is pervasive racism, sexism, colonialism, etc. That is a big aspect of my Brer Rabbit project this summer, so I'll have more to say about that in future posts. As an example of great description and annotation, the late, great John Bruno Hare's prefatory notes for the books at Sacred Texts are a wonderful model. In so many ways, the Sacred Texts Archive has been an ideal and inspiration for me ever since I first got online back in 1998.

Navigation. There is also a certain clunkiness in working with things that come in book form, so just helping people navigate the books is part of the process, especially since these are not necessarily books that you read from cover to cover; instead, you might just be interested in reading a few selected stories from a given book. So, what I need are not just links to the books, but links that go directly to specific stories in those books (and once again Sacred Texts Archive took this path, with books broken up into separate webpages, one page per story, each directly addressable). Ultimately, a remix system would be great; I manually created the thousands of pages in my UnTextbook a few years ago, and it was a fun experiment, but for the next iteration, I want something more flexible. I've proved the UnTextbook can be a fantastic way to approach the reading for the class, with students choosing their own reading pathways: I would like to open that up even more and make it even more configurable by the readers.

Discoverability. I see discovery as taking place through browsing, randomness, and search. I really enjoy making randomizers for the books (I'm presenting on randomizers at Domains19, whoo-hoo!), and I would like to create environments that are good for browsing. Search is also a priority, and it is a real problem too. It helps to be able to search the story titles, but not all story titles are equally revealing. Full-text search works on the book level, but not so well across books, and it also depends on the accuracy of the OCR (which ranges from excellent to abysmal; again, Sacred Texts Archive, along with Project Gutenberg, are invaluable as sources of truly digitized text). Ideally, I would be writing up short synopses for the stories that would facilitate searching, with some use of keywords and other forms of tagging.
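
To make the randomness-and-search idea a bit more concrete, here is a minimal Python sketch, assuming a tiny catalog in which each story has a title, a short synopsis, and a few tags. Everything in it is hypothetical: the `Story` class, the sample entries, and the function names are invented for illustration and do not describe how the Freebookapalooza is actually built.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Story:
    title: str
    book: str
    synopsis: str = ""
    tags: set = field(default_factory=set)


# Hypothetical sample catalog (made-up entries); a real one would be
# loaded from a spreadsheet or database.
CATALOG = [
    Story("The Lion and the Mouse", "Sample Fables",
          "A mouse repays a lion's mercy by freeing him from a net.",
          {"lion", "mouse", "kindness"}),
    Story("Story Seven", "Sample Folktales",
          "A rabbit tricks a fox and escapes into the briar patch.",
          {"rabbit", "fox", "trickster"}),
    Story("The Monkey's Leap", "Sample Epics",
          "A monkey hero leaps across the sea to find a captive queen.",
          {"monkey", "sea", "quest"}),
]


def random_story(catalog, rng=random):
    """Randomness: return one story chosen uniformly at random."""
    return rng.choice(catalog)


def search(catalog, query):
    """Search: keep stories where every query word appears (as a
    case-insensitive substring) in the title, synopsis, or tags."""
    words = query.lower().split()
    hits = []
    for story in catalog:
        haystack = " ".join(
            [story.title, story.synopsis, *sorted(story.tags)]
        ).lower()
        if all(word in haystack for word in words):
            hits.append(story)
    return hits
```

In this toy catalog, "Story Seven" is exactly the kind of unrevealing title mentioned above: a search for "rabbit" only finds it because the synopsis and tags are searched along with the title.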

Time, Time, Time

What's hard about projects like this is that there is never enough time to do all the things you want to do. Should I spend my time finding more books to catalog? Creating active links in the tables of contents? Writing annotations and synopses?

Luckily I enjoy all of these tasks tremendously, and I've been able to take a very casual, unplanned approach to all this work over the past years since it's really just been a side hobby, with most of my efforts focused on course design, not content.

Now, though, I need to start making some real choices. I feel like I've reached my goals with course design, so henceforth I will be focusing my hobby-time on this kind of work, and I want to end up with some products of real value to me and to others. In particular, I want to create a really good Brer Rabbit Resource Book that could be repurposed and even redesigned by the user for different audiences/contexts (Brer Rabbit in an American history class would look different than Brer Rabbit in an English literature class, etc.).

I also have an idea for a "1001 Public Domain Nights" or something like that where I will pick 1001 public domain story collections, choose just one story from each collection, and weave them together into a book where each story will somehow lead to the next story and so on through shared motifs and themes. Even better: a make-your-own 1001 Nights, where each story would have keys that link to other stories and you choose what you want next: another story with a lion? about vengeance? with a happy ending? etc.
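
As a rough sketch of how those story "keys" might work, here is a small Python example, assuming each story simply carries a set of keys (motifs, themes, ending types). The story titles, the keys, and the function names `next_choices` and `build_chain` are all hypothetical placeholders; this is one possible approach, not a finished design.

```python
# Hypothetical stories and keys, made up purely for illustration.
STORY_KEYS = {
    "Story A": {"lion", "vengeance"},
    "Story B": {"lion", "happy ending"},
    "Story C": {"vengeance", "sea"},
    "Story D": {"sea", "happy ending"},
}


def next_choices(current, wanted_key, visited, story_keys=STORY_KEYS):
    """Make-your-own mode: list the unvisited stories that carry the
    key the reader asked for (another lion? vengeance? happy ending?)."""
    return sorted(
        title for title, keys in story_keys.items()
        if wanted_key in keys and title != current and title not in visited
    )


def build_chain(start, story_keys=STORY_KEYS):
    """Woven-book mode: a greedy chain in which each story leads to the
    alphabetically first unvisited story sharing at least one key.
    The chain stops when no unvisited story links to the current one."""
    chain = [start]
    visited = {start}
    while True:
        current_keys = story_keys[chain[-1]]
        candidates = sorted(
            title for title, keys in story_keys.items()
            if title not in visited and keys & current_keys
        )
        if not candidates:
            return chain
        chain.append(candidates[0])
        visited.add(candidates[0])
```

A greedy chain like this can dead-end before using every story, so a real 1001-story sequence would probably need hand-tuning or a smarter ordering.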

And I still want to do Star Trek Aesop where I will retell Aesop's fables using characters (and animal species) from the Star Trek universe.

Yes, these are the kinds of nerdy fantasies that I have for my retirement, ha ha.

Anyway, for the next few weeks I will just keep on messing around... but when summer comes, I really want to start getting serious and thinking about priorities and possibilities so that I can make good use of my time and find the right technology tools to be using too. I've gone a long way with spreadsheets and blogs, but the time has come for a real database and a real CMS.

And I'll close with this curation graphic from the ever-inspired Silvia Tolisano at Langwitches: Blogging as Curation. I am excited about my coming summer of curation! :-)