Really glad you wrote this post, Robert. One way I've used AI (Claude, specifically) is for Just in Time Teaching. My students consume content before class by reading or watching videos and they respond to a series of open-ended questions on the material they consumed. Just before class, I download their responses and feed them to Claude, asking the AI to summarize major themes. This helps me tailor what happens in class based on their collective strengths and weaknesses. This use of AI walks the line between providing constructive feedback and grading -- I'm not assigning grades based on the AI assessment (students get credit for completion, not correctness); rather, I'm using the AI's ability to quickly summarize and categorize 40 student responses so I can provide customized instruction that is tailored to this particular group of students.
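For anyone curious what that workflow looks like in practice, here's a minimal sketch of the "gather responses, build one summarization prompt" step. The CSV column name, the prompt wording, and the model name in the comment are all my own assumptions; adjust them for however your LMS exports responses.

```python
import csv

def build_theme_prompt(csv_path, question_col="response", max_responses=60):
    """Read student responses from a CSV export and build a single
    summarization prompt for an LLM. The column name is an assumption --
    match it to your LMS export."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    responses = [r[question_col].strip() for r in rows
                 if r.get(question_col, "").strip()]
    numbered = "\n".join(f"{i + 1}. {text}"
                         for i, text in enumerate(responses[:max_responses]))
    return (
        "Below are pre-class responses from my students. Summarize the 3-5 "
        "major themes, note common misconceptions, and flag anything I should "
        "address in class. Do not grade or rank individual students.\n\n"
        + numbered
    )

# The resulting prompt can be pasted into Claude directly, or sent via the
# API, along the lines of:
# client.messages.create(model="claude-sonnet-4-5", max_tokens=1024,
#                        messages=[{"role": "user", "content": prompt}])
```

The "do not grade or rank" instruction in the prompt mirrors the completion-not-correctness policy above, so the tool stays on the summarizing side of the line.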
Great use case for flipped instruction!
I offered my Real Analysis students https://hallmos.com, an NSF-funded project that helps with student use case 2. They seemed to like it at first, but when I later asked whether they had been using it, they had forgotten all about it and said they preferred in-class feedback.
I'd never heard of that tool before, so thanks! Also, it's really weird how some tech catches on and some doesn't.
On the privacy/FERPA issue, one way to handle that in theory is to run an LLM locally on your machine instead of using a web-based product. It's worth a couple of hours of experimenting: download something like LM Studio and try a few different models. Some of these models have interesting quirks in their outputs that you might prefer. However, most people will discover that consumer laptops and desktops don't have the oomph to make this practical on a regular basis; the interaction is very laggy. A gaming machine gets closer to reproducing the web-based experience. Hopefully, providers will eventually develop locally installed models and interfaces that work well on the most common consumer machines.
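One nice thing about LM Studio is that it can serve whatever model you've downloaded through an OpenAI-compatible endpoint on localhost, so student data never leaves your machine. Here's a rough sketch of what a request looks like; the port (1234 is LM Studio's default at the time of writing) and the model name are assumptions, so check your own server settings.

```python
import json
import urllib.request

def build_local_request(prompt, base_url="http://localhost:1234/v1",
                        model="local-model"):
    """Build a chat-completion request for a locally served model.
    LM Studio exposes an OpenAI-compatible API; the default port and
    model name here are assumptions -- check your server settings."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the local server running, you would send it like this:
# with urllib.request.urlopen(build_local_request("Summarize: ...")) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint is OpenAI-compatible, most existing client libraries also work against it by just pointing them at the localhost URL.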
I recently discovered that Claude can create web apps like the ones in your Student use case #1 - super cool! In my experience, though, I've had to iterate at least a few times before the practice exercises or the interactive activity is how I like it, so I've been thinking more about creating these kinds of apps for my students rather than sending them to AI to make their own. I even had Claude walk me through how to set up a GitHub repository so I could make the .jsx files go live (I'm working on a game for learning propositional logic here: https://jenellegloria.github.io/logic-games/ )
It’s so tough. I think there are very real concerns about AI and thinking skills, but I also think there’s promise. I don’t feel great about the idea of using it to make high-level decisions, like which course standards to set, but it’s really fascinating that the ones it came up with were similar to the ones you’d already set - especially since you’re an expert who has done this for so long. That could be part of my own hesitation, too: it feels important for me in my position (first semester doing SBG) to make those high-level decisions myself, maybe even to do it imperfectly, as I grow my own judgment and teaching practice.
That's super cool about the GitHub repo. I honestly didn't even think about actually making the practice apps executable live on the web -- sometimes I think my thinking is too small around these things. I'm going to have to try this myself!
I think the proper mindset around using these tools is this: you don't let AI make decisions FOR you; you use it in an advisory role. In the case of standards, I think the right approach is to first come up with what YOU think are the right standards, then use the tools to get feedback on your work, then make changes or improvements based on that feedback if you think any are warranted. It's just another instance of the feedback loops we talk about here all the time. You'd use a human the same way (not to write your syllabus for you, but to read what you have and give you feedback), and in fact a combination of human and AI feedback seems like it would be better than either one individually. Then over time you build better judgment and aren't "relying" on the tools so much.
I just wonder if I’d really build better judgment that way or just a good simulacrum of it. I wonder if it’s important to actually fail on the path to learning. Like to try something and SEE it not working rather than Claude telling me that won’t work. If we could use AI to make everything perfect on the first go-round would we have lost something, too? I truly don’t know. But I wonder about it.
I think engaging in a feedback loop produces real judgment. What I described here is not taking an AI's word as authoritative or final. It's just feedback, and you as the human still have the responsibility to make sense of it and plan from it. We treat human feedback the same way: we don't let a desire to experience things for ourselves prevent us from talking to colleagues or friends and getting their input on things.
I too am convinced of the utility of generative AI in academia when used appropriately, and have personally been using Gemini to speed up the creation of STACK mathematics questions and to help with some workflow optimisations.
However, I've been reading Empire of AI by Karen Hao, and it is really giving me pause about using these LLMs at all: particularly the accounts of the exploitation of human workers doing RLHF to train the models, and of the environmental and societal impact, especially in the global south, of the relentless quest for scale in building and running ever-larger data centres. If you haven't read this book yet, I would highly recommend it.
Thanks, I will add this to the queue.
For me it’s about knowing where to draw the line at non-use of a technology. If I stop using AI over a legitimate issue with environmental impact, for example, should I then stop using any technology with a similar track record? Am I also prepared to stop using airplanes, or air conditioners, or fossil fuels in general? If I am concerned about human exploitation, am I prepared to stop buying any products from countries where workers are exploited or where human rights violations are egregious? Is my concern just virtue signaling, or am I ready to take it the entire distance and apply the disengagement equally to all similar situations?
Understand that these are questions for myself only. Other people are free to work them out for themselves, possibly with different premises and different results.
I'd be interested to hear your thoughts on the book as a fellow maths/teaching/tech enthusiast if you read it.
These are valid points you make, and things we have to contend with all the time in the society we find ourselves in. What took me aback reading the book was the sheer scale of the industry and its growth rate, which, for me, may put it in a different category from the other technologies we know and love (or hate).
Agree that the choice of where to draw the line is a personal one, but I wouldn't characterise concern about a specific technology without concomitant disengagement from other tech as necessarily virtue signalling or hypocritical; that has more to do with how you communicate about it.
My thoughts on all of this aren't well-formed yet, my feelings on it even less so, and my personal line yet to be drawn :)
Thanks for sharing some great ideas. What we see is that most faculty members look at this from a "teacher perspective" and often completely miss the "student perspective" - what you call "faculty" and "student" use cases. Do you know of any sources where this is discussed? In our view, GenAI is most disruptive when you look at it from the student perspective: that's where the big impact is, and also the even bigger opportunity to step in with alternative grading ideas.
Honestly, I think those must be out there, but I don't have a curated list. I've been trying to view genAI as a tool for deliberate practice, but I haven't seen anything like a systematic treatment of the tech from that lens. It actually makes me wonder if there's an interesting controlled study in there somewhere -- whether mindful use of AI improves deliberate practice. Surely I can't be the first person to think of this, though.