Sunday, April 28, 2019

Data-Mongering (3): Bullshit, Sabotage, and Conscientious Objectors

This is my third of these round-ups; you can see them all here: Data-Mongering Round-Ups. I'm also annotating as I go in Diigo, so you can see the Diigo files here: Data-Mongering articles. The editorial comments there are just copied-and-pasted from the blog posts.

Calling Bullshit: Data Reasoning in a Digital World. Fabulous syllabus from Carl T. Bergstrom and Jevin West. Check out Week 7! Week 7. Big data. When does any old algorithm work given enough data, and when is it garbage in, garbage out? Use and abuse of machine learning. Misleading metrics. Goodhart's law.

10 Ways Data Can Sabotage Your Teaching by Terry Heick. I really like the teacher's perspective in this great list from Terry Heick. See the article for examples and insights for each item: 1. The assessments are imprecise. 2. The inferences based on assessment result are limited or erroneous. 3. Assessment is infrequent. 4. The assessment is poorly-timed. 5. Data is dated. 6. ‘Depth of Knowledge’ isn’t factored. 7. Data is not transparent or accessible to others. 8. Data sources are not diverse. 9. Inflexible curriculum that resists data ‘absorption’. 10. There is too much data. 

There has been lots of reporting this week on Brian Goegan's speaking out at Arizona State University about mandated use of courseware, which also means surveillance, as @LibSkrat points out here:

Unethical numbers? A meta-analysis of library impact studies by M. Brooke Robertshaw and A. Asher. The next battleground: libraries. From the paper's abstract: "This paper presents the results of a meta-analysis of learning analytics studies in libraries that examine the effects of library use on measures of student success. Based on the aggregate results, we argue that outcomes of these studies have not produced findings that justify the loss of privacy and risk borne by students. Moreover, we argue that basing high-impact decisions on studies with no, or low, effect sizes, and weak correlation or regression values, has the potential to harm students, particularly those in already vulnerable populations."

EDUCAUSE Horizon Report: 2019 Higher Education Edition. All the hype is here of course: machine learning, AI, predictive analytics, etc. There is the occasional acknowledgment that there might be ethical concerns, but as you would expect, the hype is strong. Very strong. One very useful feature of the article is lots of hyperlinks for future reading.

I Used to Work for Google. I Am a Conscientious Objector. by Jack Poulson (in the NYTimes Privacy Project). This piece expresses exactly why I think it is so important for people to speak out: "The time has passed when tech companies can simply build tools, write algorithms and amass data without regard to who uses the technology and for what purpose."

How much are we sacrificing for automation? by S. A. Applin. This also is not education directly, but there are lots of warnings here for us about educational Taylorism too: Using counting, metrics, and implementation of outcomes from extreme data analysis to inform policies for humans is a threat to our well-being, and results in the stories we are hearing about in the warehouse, and in other areas of our lives, where humans are too often forfeiting their agency to algorithms and machines. And here's more from the article via Donna Lanclos:

And on the subject of mongering, rather than data, check out this post from Dr. Chuck: Why do People Like Sakai, given the Market Share? ... in particular, the PPS: "P.P.S. Instructure spent $135M last year on marketing and sales.  They took this money from the pockets of higher education and used it to convince more schools to give them more money."

Friday, April 26, 2019

Iyengar's Art of Choosing

I've copied and pasted this old post from my Canvas Community blog in order to reference it here.

I wrote a post last week about student choice as a course design principle, and I wanted to follow up on that today. In that post I said: I think it's wonderful that Instructure brought the awesome Sheena Iyengar to speak at InstructureCon (I read her book and loved it), but it would be even better if Instructure listened to what she said and respected student choice as an important element of course design.

It worries me that Canvas's commitment to simplicity can lead to oversimplification, which is not good for learning. (I'm a big believer in Carol Dweck's notion of making challenge the new comfort zone, Vygotsky's ZPD, etc.). In my experience, cognitive underload is a more clear and present danger than cognitive overload. Boredom, lack of engagement, lack of motivation, etc. etc. are the real problems I am struggling with. And one of the best strategies I have for combating boredom et al. is STUDENT CHOICE.

So, naturally I was very interested to read Iyengar's book, The Art of Choosing ... but I had no idea how much it would expand my understanding of choice; here are my Kindle highlights.

Let me start with a study I mentioned briefly in the previous post; this is an early study of hers that showed teacher-choice was the least motivating of the three learning conditions she tested: student-choice, mother-choice, teacher-choice. Among Anglo students, student-choice was the most motivating condition ("Anglo American children who were allowed to choose their own anagrams and markers solved four times as many anagrams as when Ms. Smith made their choices for them, and two and a half times more than when their mothers supposedly chose for them"), while for the Asian-American students, mother-choice was the most motivating ("the Asian American children performed best and were most motivated when they believed their mothers had chosen for them. These children solved 30 percent more anagrams than those who were allowed to choose their materials themselves, and twice as many anagrams as children who were assigned materials by Ms. Smith"). For both groups of students, though, teacher-choice was the least motivating condition. Yet teacher-choice is the dominant course design feature across the board in both K-12 and higher ed: teachers make the choices (what to read, what to study, what to write, etc.), instead of creating courses based on supporting and facilitating student choice.

Iyengar's book then goes into great detail about just what it means to have a "choice" (real choice, free choice, meaningful choice), and I would really recommend that everybody read it. She is a great writer, and the narrative of her research is really compelling as she shows from one experiment to the next how she is driven by yet more subtle questions about choice. Each experiment answers some questions, but raises more questions in turn, and she is very attentive to cultural differences (something sadly lacking in a lot of education research), as you can see already in that early study above.

And, I am pleased to say, Iyengar's book was really transformative for me: I had previously looked at student choice as a practical strategy for motivating students; I wanted the students to choose what to read and what to write about because I hoped in that way they would be more engaged and would produce better work. More choice, better work, more learning... which would mean I could feel good about the job I was doing as a teacher. End of story.

But after reading Iyengar's book, I see things differently. I still know that choice is a very powerful strategy for motivating students (15 years of teaching tells me so), but Iyengar got me to see the question of choice as something of far greater importance, something existential, extending far beyond the classroom. And it is also something complex and even paradoxical, not straightforward or simple at all:
To be ourselves while remaining adaptable, we must either justify a decision to change as being consistent with our identity, or we must acknowledge that our identity itself is malleable but no less authentic for it. [...] One might say that we are trying to arrive at a state of homeostasis through a feedback loop between identity and choice.
That was a WOW moment for me in reading her book, identifying this back-and-forth between identity and choice and how there is a feedback loop there, which is also the source of the paradox: do I make my choices because of who I am? or am I who I am because of my choices? Or .... (cue Twilight Zone music) ... is it paradoxically both at once?

Much of Iyengar's book is taken up with this interplay of identity and choice, and also with the question of how HAPPINESS emerges through that interplay of identity and choice, and also the tension between constraint and creativity. Iyengar ends up being a strong advocate for choice, and that advocacy is based on a really deep understanding of people's choices as she has documented them in so many different experiments with so many different people from different cultures in different contexts.

Even more importantly: Iyengar is very aware of the Pollyanna trap into which I think many teachers fall (I know I do), acting as if the choices we can offer in our classrooms can somehow compensate for the world of injustice in which we live:
At its best, choice is a means by which we can resist the people and the systems that seek to exert control over us. But choice itself can become oppressive when we insist that it is equally available to all. It can become an excuse for ignoring inequities that stem from gender or class or ethnic differences, for example, because one can blithely say, “Oh, but they had a choice!” [...] As we saw in the very first chapter, the promise of choice, the language of choice, and even the mere illusion of choice have the power to motivate and uplift us. We should not, however, take this to mean that faith, hope, and rhetoric alone are sufficient.
So, with that incredibly important caveat in mind (and I've written elsewhere here about designing-for-equity), I want to close this post with a really cool exercise that Iyengar recommends for discovering motivation, the real intrinsic motivation, the motivation that is both your identity and your choice:
Try this for yourself. Write three versions of the story of your life (or a particular period in your life), looking in turn through the lenses of destiny, chance, and choice. [...] Which of these versions is most motivating for you? Which one encourages you to try harder, push further, reach higher? Which emphasizes that you have the power to go from where you are today to where you want to be tomorrow?
And here's something really cool you will see when you read the book: Iyengar starts the book with three such narratives about her own life: this woman practices what she preaches!

I read the book too late in the summer to be able to weave these ideas into my classes, but next summer I want to build in a new layer of writing based on this idea of "my story" (or stories!). My classes are already centered on storytelling, and I would really like to help my students find connections between their writing choices in the class and the life choices that they are making, seeing how all of those stories (the real ones and the fictional ones) emerge from that interplay of creativity and constraint.

Anyway, this post is already way too long... there is so much more that I would say, but if I have made you curious to go read the book, then I will consider this post a success. Thank you for reading! :-)

Sunday, April 21, 2019

Data-Mongering (2): Algorithms and Agency, GIGO, and more

This is my second of these round-ups; you can see them all here: Data-Mongering Round-Ups. I'm also annotating as I go in Diigo, so you can see the Diigo files here: Data-Mongering articles. The editorial comments there are just copied-and-pasted from the blog posts.

It's depressing to keep on reading and learning about this, but especially now that I'm reading Shoshana Zuboff's The Age of Surveillance Capitalism, I can see that the shift is happening at Instructure just as it has in one company after another: under their new CEO, Instructure has realized that its "collateral" data can actually be commodified, turning user behavior into the product that is being sold (Soylent Canvas). Who would have thought that the main outcome of the digital revolution in education would be the triumphant return of behaviorism...? Eegad. Skinner would be so happy. And I am not.

Anyway, here are some of the things I read this week that made me stop and think:

Postdigital: 10 years later, Algorithms and Agency by Lawrie Phipps. This piece gets at so many of my deep concerns right now, and its "looking back" perspective to 10 years ago shows what a dramatic shift there has been in the normalizing of expansive digital networks, both good and bad. For example, re: TurnItIn and their ilk: "Would the sector have been so fast to sign up to a plagiarism service 10 years ago, if they had known all the student IP would one day be the property of a publishing company?" I too was wildly naive 10 years ago, and guilty of some techno-evangelism I guess. I still love teaching online, but more and more I see the technological space in which I am supposed to do my work (Canvas) as being a threat, not a resource. Another quote: "The naive utopia we described in our 2009 postdigital paper probably only exists in the minds of idealists and tech evangelists. People have designed digital tools, platforms, and other environments with political and financial motives. In our current postdigital world, digital does not serve the social, but through the manipulation of people, it is driving a particular kind of society, one that exploits the weaknesses and fears of people; enables the rise of racism and xenophobia, and intensifying inequality."

Why ‘learning analytics’? Why ‘Learning Record Stores’? by Donald Clark. I don't always agree with Donald Clark, but I think he is spot on in his criticism of learning analytics hype here: "Perhaps the best use of data is dynamically, to create courses, provide feedback, adapt learning, text to speech for podcasts and so on. This is using AI in a precise fashion to solve specific learning problems. The least efficient use of data is storing it in huge pots, boiling it up and hoping that something, as yet undefined, emerges" (that last bit sounds just like the pie-in-the-sky claims by Instructure's CEO that just because they have lots of data they can get lots of use out of it). Specifically on AI and learning behavior: "Recording what people just ‘do’ is not that revealing if they are clickthrough courses, without much cognitive effort. Just showing them video, animation, text and graphics, no matter how dazzling is almost irrelevant if they have learnt little. This is a classic GIGO problem (Garbage In, Garbage Out)."

One Way to Reduce Gender Bias in Performance Reviews by Lauren Rivera and András Tilcsik. This is a fascinating piece at Harvard Business Review that warns us to be suspicious of any measurement because the measuring stick itself shapes the data in ways that we never realized or intended, like the way that women are more discriminated against if you use a 10-point rating scale as opposed to a 6-point scale. So, before we start measuring everything, we need to stand back and think about the prejudices that are going to inform/deform every supposedly objective measurement we make.

Institutions’ Use of Data and Analytics for Student Success by Amelia Parnell, Darlena Jones, Alexis Wesaw, and D. Christopher Brooks. This is part of an Educause research project, and it's a good reference point for the ways that schools are trying to use data to improve student success. It is such a slippery slope, and insofar as these systems rely on numbers, and grades in particular, I am dubious. My main concern, though, is the fact that in their eagerness to run their own data experiments, schools have given companies like Instructure way too much freedom to commodify and monetize that student data for purposes that go far beyond any local initiative. From this report, I learned that my school is not alone in a strong focus on first-year retention, and the report also shows that efforts are instead going to advising, tutoring, and counseling, which is again the case at my school. In my opinion, we need to focus on the direct educational mission, not just on ancillary supports. Sadly, the report does not recommend strengthening faculty role or involvement; instead, the recommendation is to "identify and expand institutionally appropriate roles for IR, IT, and student affairs." But there was also this bizarre quote plunked down in the middle of the discussion of admin and support services: "As algorithms become more sophisticated, there will increasingly be opportunities for faculty to become more engaged in the delivery of interventions." The one bright spot was this recommendation: Recommendation 4: Increase the use of qualitative data, especially from students. Yes, I say, yes! Student voices please!

Developing Minds in the Digital Age: Towards a Science of Learning for 21st Century Education. Big book (250 pages) from Patricia Kuhl et al. at OECD, which I learned about from Ben Williamson at Twitter. I haven't read the book yet; I was really struck, though, by the capitalization of Big Data and Artificial Intelligence as if they were gods or something; what's up with that??? This image is from Ben Williamson's tweet:

Anyway, the book looks useful, and I will give it a read this summer. 

Insurers Want to Know How Many Steps You Took Today by Sarah Jeong. I already knew a lot of the content covered here in the NYTimes article from reading Cathy O'Neil's Weapons of Math Destruction, and the whole "health management" business shows just what dangers await us in the "learning management" business: "As machine learning works its way into more and more decisions about who gets coverage and what it costs, discrimination becomes harder to spot." College pricing is already a nightmare (sticker price, as it were, versus what individual students end up actually paying). That's just one nightmare scenario I can see playing out in future, as colleges become increasingly convinced, rightly or wrongly, that they can predict accurately just how students are going to perform (and creating self-fulfilling prophecies as a result of the biases that they institutionalize in this way...).

And from @YujieXuett, a screenshot (which of course I cannot read) of data-mongering in Chinese schools:

Sunday, April 14, 2019

#TotalCoLearner: a great semester of Hanuman learning

I wrote my first data-mongering round-up yesterday and did a lot of reading on predictive analytics in education... and ugh, it's all worse than I expected. But learning is good, and the better informed I become, the more useful I can be in voicing opposition to this dehumanized education.

Meanwhile, before the weekend runs away, I wanted to write up something about the #TotalCoLearner experiment this semester in Indian Epics, because it has gone GREAT. I've got a series of #TotalCoLearner posts over at my Canvas blog, plus tweets, but this is my first post here about #TotalCoLearner, and it's perfect timing since I just wrote my last story of the semester yesterday, and I'll be wrapping up the class soon, finishing up early as some of my students do too.

So, what is #TotalCoLearner? The idea is that I do the whole course just like a regular student! That means you can see my course blog, just as the students each have their own blog, and I also have a course project website, just as the students do. One of the best things about all of this is that students comment on my blog and on my website at random just like they comment on the work of other students at random. And yes, they are surprised to find out that I am a student in the class, and it's kind of a weird surprise, but a good one; you can see their comments on my Introduction post this semester here. You can also see their comments on my project at the Comment Wall.

I keep track of my progress as the students do, although I use a spreadsheet instead of the Canvas Gradebook... and, honestly, I feel bad about how clunky and primitive the Canvas Gradebook is; the spreadsheet I use is way easier to configure based on the different ways I want to check my progress (by date, by type of assignment, by my plan for finishing up, etc.). For example, you can see here that as one of the few assignments I have left to do, there's one I should do today, which is writing up a famous last words post. I'll do that after I finish this post.

The nature of the course design means I really can do everything exactly as the students do; I don't have to "pretend" anything... I can just be myself. Admittedly, I'm not a typical student, but the whole point of my course design is that there is no "typical" student. Instead, every student shows up here with their own background and interests, their own skills and gaps, their own goals and priorities. Based on all that, each person is choosing what they want to read and write and other work that they do for the class, week by week, sharing their work via their blog and their website. It's because I can choose that I am able to adapt the class to suit my learning needs and goals, and the students are also doing the same thing for their own needs and goals.

The only difference between my work for the class and what the students are doing is that I can't do the weekly "project feedback" assignments because I do feedback already on all the projects every week as part of my job as a teacher. So, no worries: I just replace that assignment with other optional assignments, mixing-and-matching from the available assignments just like the students also do based on what assignments they choose to do (or not). This semester I did extra credit reading posts because I was reading a ton of stuff.

And as a result, oh my gosh, I LEARNED SO MUCH. And that's because I set myself a really cool and new challenge: I immersed myself in the version of the Ramayana from Thailand known as the Ramakien, and I also gave myself a crash course in the arts of Thailand that are inspired by Rama's story (Khon theater masks, temple sculptures, so much beautiful stuff). I've always known about the Ramakien's existence before, and I knew about Suvannamaccha, Hanuman's mermaid lover... but that's all. I had never read the whole thing. So, this semester, I read the whole thing! (In a terrible translation, but alas, there is no good English translation of the Ramakien.)

So, just like the students, I posted my reading notes week by week on the Ramakien (and I also re-read Chitra Divakaruni's Mahabharata novel, Palace of Illusions, plus I read Samhita Arni's new novel based on the Silappadikaram, a south Indian epic; more about that here). For Tech Tip extra credit, I built some randomizing image widgets with Hanuman art from India and from Southeast Asia, and I even learned how to embed image randomizers into a Google Sites page! On the writing side, I also pushed myself in new ways so that the project I ended up writing was actually not an anthology (my usual writing approach), but a true extended narrative that ended up wrapping around at the end back to where I started, with a Ramakien-inspired story about Hanuman's mother to which I added my own big reveal at the end; that's the story I wrote yesterday: Hanuman and Pirakuan. Figuring out how that story would work may be the biggest writing thrill I've ever had... I'm so proud of figuring out how to bring Hanuman's mother back around into the story there at the end. (And, yes, that means my project ended up being centered on women's stories, which is a theme that comes up again and again in student projects in the class too, finding ways to decenter the men's stories so that we can bring more women's voices into the epic world.) I also wrote some stories here at the blog separate from the project that I was really proud of, especially my story about Mandodari, plus one about Arjuna and Hanuman.

So, for finishing up the class, I need to add that final Hanuman-and-Pirakuan story to my Storybook website, as well as adding another image gallery page to the site, along with some wrap-up posts at the blog, which I'll probably do next weekend. And while I will be sad to end this particular learning adventure (there is so much more I still want to learn about the southeast Asian Ramayana!), I am also so excited about what I will do for the Myth-Folklore class next semester. I've got a huge (HUGE) Brer Rabbit project that I began over winter break, and that is what I am going to use to ignite my participation in the Myth-Folklore class in the Fall.

And yay Brer Rabbit too! :-)

Saturday, April 13, 2019

Data-Mongering: Platform U and Other News of the Week

It's Saturday and part of me wants to write a happy blog post about open-ended pedagogy and colearning... and maybe I'll do that later (see #TotalCoLearner at Twitter), but I think I need to review the data-mongering articles I read this week. In fact, I might try to make this a weekly round-up of sorts, scrolling back through my Twitter feed and sharing links here. These are articles that I read this week; some are new, but some are old which I only now got around to reading.

And what is "Platform U" you might ask? Read on:

The platform university: a new data-driven business model for profiting from HE by Ben Williamson. This article discusses exactly what I see happening at Instructure, and why I am so unhappy about it: 
Despite studies repeatedly showing Turnitin’s high error rate, and considerable concern over the mistrust it creates while monetising students’ intellectual property, its acquisition clearly demonstrates huge market demand for data-driven HE platforms. It changes how students are valued—not for their independent intellectual development but as raw material to be mined for market advantage.

How Ed Tech Is Exploiting Students [premium at Chronicle of Higher Ed] by Chris Gilliard. This is an article from last year warning about the dangers discussed in Williamson's article, focusing specifically on the students' lack of consent in the exploitation of their data:
When we draft students into education technologies and enlist their labor without their consent or even their ability to choose, we enact a pedagogy of extraction and exploitation. It’s time to stop.

Colleges Are Banding Together Digitally to Help Students Succeed. Here’s How [premium at Chronicle of Higher Ed] by Alexander C. Kafka. A truly horrifying piece about Canvas data mining, this time in the context of the Unizin consortium (my school does not belong). This is exactly what Goldsmith at Instructure promised (Soylent Canvas), and now with endorsement from the educational administrators themselves: they really believe in this data nightmare. Sad to see Jared Stein quoted here; I guess the whole Instructure crew really is on board with the new predictive-analytics push where students are reduced to their clickstreams and pageviews (my thoughts on AI Overreach: students are more than the data they leave behind in an LMS!).
Take students’ clickstreams and pageviews on the learning-management system, their writing habits, their participatory clicks during classroom discussions, their grades. Then combine that with information on their educational and socioeconomic backgrounds, their status as transfer students, and so on. You end up with "a unique asset," says Wheeler, in learning what teaching methods work.

Counting the Countless: Why data science is a profound threat for queer people by Os Keyes. The observations here about the state apply to educational institutions also, and it is surely the most vulnerable students who are going to be hurt most by tracking based on predictive analytics driven by LMS data-mining:
So: trans existences are built around fluidity, contextuality, and autonomy, and administrative systems are fundamentally opposed to that. Attempts to negotiate and compromise with those systems (and the state that oversees them) tend to just legitimize the state, while leaving the most vulnerable among us out in the cold. This is important to keep in mind as we veer toward data science, because in many respects data science can be seen as an extension of those administrative logics: It’s gussied-up statistics, after all — the “science of the state.”

Margin of error in data-driven decisions by Robin De Rosa. This is a great piece on the gap between quantitative and qualitative data, especially in education... data has to be more than number-crunching!
When we ask whether there is evidence for something related to learning, we are presuming that we all agree 1) what learning is and 2) what constitutes evidence. I contend that “learning” is broader and messier than what we generally assess, and also that “evidence” has been reductively equated with quantification and with the assumption that environments in education are controlled. At the core, I think the biggest problem is that we forget that humans aren’t just giant brains walking around: we are also a jumble of social contexts, emotions, and circumstances.

10 technologies that will impact higher education the most this year by Macy Bayern. Yep, predictive analytics, AI, nudges, it's all there. Instead of a pull-quote, I will share Robin's tweet. What she said.

And this cartoon that Bob Calder shared with me is a great way to express how all this top-down data-mongering looks very different from the point of view of teachers and students who are being surveilled. The cartoon circulates in lots of languages, but I think it may have started with the Polish version (?):

He likes it!
Have fun playing, little one.

Sunday, April 7, 2019

Curating a Public Domain of Folklore and Mythology

I've been blogging about Canvas here over the past few weeks, but there are other/better/happier things to blog about, especially now that SUMMER is coming (the semester is over on May 4 for me!)... and summer means PROJECTS. More specifically it means PUBLIC DOMAIN projects, so I am going to write up a post here with some thoughts about what the public domain means for me in general, and more specifically what I hope to be doing this summer.

The Freebookapalooza

The public domain of printed books is, for me, the most important resource for teaching my classes, and it is also where I want to focus my efforts when I finally retire from my job (or get laid off... whichever comes first, ha ha). Thanks to the amazing resources at Hathi Trust, Internet Archive, Project Gutenberg, and Sacred Texts Archive, along with other online book projects, there is a wealth of material in the world of folklore and mythology that is available in the form of full-text books online. Mythology and folklore is a field that really lends itself to these public domain resources, and the main way in which I have been curating those public domain resources for folklore and mythology is at my Freebookapalooza.

The Freebookapalooza is just a simple blog where each post at the blog is about a book online, most of which (but not all) are public domain books. For each post, I include basic information about the book along with link(s) to the book online. I also include a table of contents so that, in addition to the book title, there are also the titles of chapters/stories in the book. Most of the books I am collecting are story collections, and having those titles can be really helpful in deciding just how useful the book might be for a specific purpose. I also include an image of some kind: the book cover, an illustration from the book, or some other image that is relevant to the book's contents. I use those images in the randomizers, like in the sidebar of this blog for example.
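For anyone curious how a post structure like this can feed a randomizer, here is a minimal sketch in Python. It is not the code behind the Freebookapalooza widgets; the `BookPost` class, its field names, and the example.org URLs are all hypothetical, chosen only to mirror the pieces described above (book info, links, table of contents, image).

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BookPost:
    """One blog post about one online book (hypothetical structure)."""
    title: str
    links: List[str]                                     # link(s) to the full text online
    contents: List[str] = field(default_factory=list)   # chapter/story titles
    image_url: str = ""                                  # cover, illustration, or related image


def pick_random_book(posts: List[BookPost],
                     rng: Optional[random.Random] = None) -> BookPost:
    """Return one randomly chosen post that has an image to display."""
    rng = rng or random.Random()
    candidates = [p for p in posts if p.image_url]
    if not candidates:
        raise ValueError("no posts with images to choose from")
    return rng.choice(candidates)


if __name__ == "__main__":
    # Placeholder data, not real Freebookapalooza entries.
    posts = [
        BookPost("Example Story Collection A", ["https://example.org/a"],
                 ["First Tale", "Second Tale"], "https://example.org/a.jpg"),
        BookPost("Example Story Collection B", ["https://example.org/b"]),
    ]
    choice = pick_random_book(posts)
    print(choice.title, choice.image_url)
```

Filtering on `image_url` first means the randomizer never shows a post without a picture, which seems like the natural choice for a sidebar widget.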

Right now, I have 1222 books posted at the blog. My goal, in honor of the year 2019, is to get to 2019 books by the end of the year, and I've got a little reminder script to keep me on track towards that goal. Right now I'm just a little bit ahead of schedule, but not by much, as you can see in this screenshot:
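The arithmetic behind a pacing reminder like this is simple enough to sketch. What follows is not my actual reminder script, just a minimal Python illustration; the function names are made up for this example, and the "expected count" function assumes a steady linear pace from a starting count that this post does not state.

```python
from datetime import date


def books_per_day_needed(current: int, goal: int,
                         today: date, deadline: date) -> float:
    """Average number of books to add per day to reach the goal by the deadline."""
    days_left = (deadline - today).days
    if days_left <= 0:
        raise ValueError("deadline must be after today")
    return max(goal - current, 0) / days_left


def expected_count(start_count: int, goal: int,
                   start: date, deadline: date, today: date) -> float:
    """Where the count 'should' be today on a straight-line schedule.

    Assumes a linear pace from start_count on the start date to goal on the
    deadline; start_count is a hypothetical input, not a figure from the post.
    """
    total_days = (deadline - start).days
    if total_days <= 0:
        raise ValueError("deadline must be after start")
    elapsed = (today - start).days
    fraction = min(max(elapsed / total_days, 0.0), 1.0)
    return start_count + (goal - start_count) * fraction


if __name__ == "__main__":
    # Figures from this post: 1222 books on April 7, 2019; goal of 2019 by year's end.
    rate = books_per_day_needed(1222, 2019, date(2019, 4, 7), date(2019, 12, 31))
    print(f"Books needed per day from here on: {rate:.2f}")
```

Comparing the real count against `expected_count` for today's date is what tells you whether you are ahead of or behind schedule.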

The Public Domain of 2019

I wanted to expand the blog this year because 2019 was a turning point for the public domain: this year, books that were published with a 1923 copyright entered the public domain! Back in 1998 Congress extended the 75-year copyright term by another 20 years, which meant that there was a long freeze on books entering the public domain. But now we are back on track, so that books from 1923 entered the public domain this year, and next year it will be books from 1924, and so on. Of course, there are books published in 1924 and later which are not restricted by copyright, some of which are even in the public domain, such as books released under the Creative Commons CC0 dedication, "no rights reserved." There are also other Creative Commons licenses, plus books which publishers put online as a public service while retaining the copyright; it's a whole big beautiful world of digital books out there! So, I include a whole range of full-text books online at the Freebookapalooza, but I focus my efforts on the public domain books which can be shared and reused freely, without any limitations.
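The arithmetic behind those dates is simple enough to write down: a 75-year term plus the 20-year extension gives 95 years, and a book enters the public domain on January 1 of the following year. Here is a minimal sketch, assuming a book whose copyright ran for the full term (the function name is just for illustration, and this ignores books whose copyright lapsed earlier for other reasons):

```python
TERM_YEARS = 75 + 20  # the 75-year term plus the 20-year extension from 1998

def public_domain_year(publication_year: int) -> int:
    """Year a book enters the public domain, assuming the full 95-year term applies."""
    # Protection lasts through the end of the final year of the term,
    # so the book becomes free to use on January 1 of the year after.
    return publication_year + TERM_YEARS + 1

for year in (1923, 1924, 1925):
    print(year, "->", public_domain_year(year))
```

So 1923 gives 2019 and 1924 gives 2020, matching the schedule described above.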

Curation Strategies

There are lots of ways to think of the curation process, and I see my work in terms of these general tasks:

Description/Annotation. A book title provides a tiny bit of description, but readers need more. In addition to just knowing about the contents of the book, readers also need a heads up about the limitations of the book, especially when it comes to pre-1924 books in the public domain where there is pervasive racism, sexism, colonialism, etc. That is a big aspect of my Brer Rabbit project this summer, so I'll have more to say about that in future posts. As an example of great description and annotation, the late, great John Bruno Hare's prefatory notes for the books at Sacred Texts are a wonderful model. In so many ways, the Sacred Texts Archive has been an ideal and inspiration for me ever since I first got online back in 1998.

Navigation. There is also a certain clunkiness in working with things that come in book form, so just helping people navigate the books is part of the process, especially since these are not necessarily books that you read from cover to cover; instead, you might just be interested in reading a few selected stories from a given book. So, what I need are not just links to the books, but links that go directly to specific stories in those books (and once again Sacred Texts Archive took this path, with books broken up into separate webpages, one page per story, each directly addressable). Ultimately, a remix system would be great; I manually created the thousands of pages in my UnTextbook a few years ago, and it was a fun experiment, but for the next iteration, I want something more flexible. I've proved the UnTextbook can be a fantastic way to approach the reading for the class, with students choosing their own reading pathways: I would like to open that up even more and make it even more configurable by the readers.

Discoverability. I see discovery as taking place through browsing, randomness, and search. I really enjoy making randomizers for the books (I'm presenting on randomizers at Domains19, whoo-hoo!), and I would like to create environments that are good for browsing. Search is also a priority, and it is a real problem too. It helps to be able to search the story titles, but not all story titles are equally revealing. Full-text search works on the book level, but not so well across books, and it also depends on the accuracy of the OCR (which ranges from excellent to abysmal; again, Sacred Texts Archive, along with Project Gutenberg, is invaluable as a source of truly digitized text). Ideally, I would be writing up short synopses for the stories that would facilitate searching, with some use of keywords and other forms of tagging.
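To make the randomness and title-search ideas concrete, here is a small Python sketch. Everything in it is hypothetical: the book and story titles are made-up placeholders, and the function names are mine, not part of any existing Freebookapalooza tool.

```python
import random

# Hypothetical sample data: book title -> story titles from its table of contents.
BOOKS = {
    "Example Book A": ["The Lion and the Hare", "The Clever Tortoise"],
    "Example Book B": ["How the Lion Lost His Crown", "The Fox's Bargain"],
}

def search_titles(books, keyword):
    """Return (book, story) pairs whose story title contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [
        (book, story)
        for book, stories in books.items()
        for story in stories
        if needle in story.casefold()
    ]

def random_book(books, rng=random):
    """Pick one book title at random, the way a sidebar randomizer might."""
    return rng.choice(list(books))

print(search_titles(BOOKS, "lion"))
print(random_book(BOOKS))
```

A search like this only sees what the titles reveal, which is why synopses and keyword tags would make such a difference.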

Time, Time, Time

What's hard about projects like this is that there is never enough time to do all the things you want to do. Should I spend my time finding more books to catalog? Creating active links in the tables of contents? Writing annotations and synopses?

Luckily I enjoy all of these tasks tremendously, and I've been able to take a very casual, unplanned approach to all this work over the past years since it's really just been a side hobby, with most of my efforts focused on course design, not content.

Now, though, I need to start making some real choices. I feel like I've reached my goals with course design, so henceforth I will be focusing my hobby-time on this kind of work, and I want to end up with some products of real value to me and to others. In particular, I want to create a really good Brer Rabbit Resource Book that could be repurposed and even redesigned by the user for different audiences/contexts (Brer Rabbit in an American history class would look different than Brer Rabbit in an English literature class, etc.).

I also have an idea for a "1001 Public Domain Nights" or something like that where I will pick 1001 public domain story collections, choose just one story from each collection, and weave them together into a book where each story will somehow lead to the next story and so on through shared motifs and themes. Even better: a make-your-own 1001 Nights, where each story would have keys that link to other stories and you choose what you want next: another story with a lion? about vengeance? with a happy ending? etc.
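The make-your-own version is basically a set of stories tagged with keys, where each key a story carries points to the other stories that share it. Here is a rough sketch of that idea in Python; the story names, the keys, and the function name are all invented placeholders, not a design I have settled on.

```python
# Hypothetical sample data: story title -> the keys (motifs/themes) tagged on it.
STORIES = {
    "Story 1": {"lion", "vengeance"},
    "Story 2": {"lion", "happy ending"},
    "Story 3": {"vengeance"},
    "Story 4": {"happy ending"},
}

def links_from(stories, current, visited=()):
    """For each key on the current story, list the unread stories that share that key."""
    seen = set(visited) | {current}
    return {
        key: sorted(
            title
            for title, keys in stories.items()
            if key in keys and title not in seen
        )
        for key in sorted(stories[current])
    }

# After reading "Story 1", the reader can follow the lion or follow the vengeance.
print(links_from(STORIES, "Story 1"))
```

Passing in the stories already read keeps the chain moving forward, so a reader never loops back to a story they have seen.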

And I still want to do Star Trek Aesop, where I will retell Aesop's fables using characters (and animal species) from the Star Trek universe.

Yes, these are the kinds of nerdy fantasies that I have for my retirement, ha ha.

Anyway, for the next few weeks I will just keep on messing around... but when summer comes, I really want to start getting serious and thinking about priorities and possibilities so that I can make good use of my time and find the right technology tools too. I've gone a long way with spreadsheets and blogs, but the time has come for a real database and a real CMS.

And I'll close with this curation graphic from the ever-inspired Silvia Tolisano at Langwitches: Blogging as Curation. I am excited about my coming summer of curation! :-)