Notes for week 2 of my course on AI Agents for Managers are up! This week we talked about Retrieval-Augmented Generation and where it fits into businesses.
augmentedreturns.substack.com/p/ai-agents-...
@uclaanderson.bsky.social
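(For anyone new to the term, here's a minimal sketch of the RAG pattern - purely illustrative, not the course material: a toy keyword-overlap retriever stands in for a real embedding index, and call_llm stubs whatever model API you'd actually use.)

```python
# Minimal RAG sketch - purely illustrative. A toy keyword-overlap
# retriever stands in for a real embedding index, and call_llm is a
# stub for a real model API.
import re

DOCS = [
    "Refund policy: customers may return items within 30 days.",
    "Shipping: standard delivery takes 3-5 business days.",
    "Support hours: weekdays 9am-5pm Pacific time.",
]

def words(s: str) -> set[str]:
    return set(re.findall(r"\w+", s.lower()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    return sorted(docs, key=lambda d: -len(words(query) & words(d)))[:k]

def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call."""
    return f"[model answer grounded in:\n{prompt}]"

def rag_answer(query: str) -> str:
    # Retrieve relevant snippets, then ground the model's answer in them.
    context = "\n".join(retrieve(query, DOCS))
    prompt = f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

print(rag_answer("What is the refund policy?"))
```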
For more details:
Substack with summary: augmentedreturns.substack.com/p/the-larges...
WSJ article: www.wsj.com/tech/ai/the-...
Upcoming NBER presentation: www.nber.org/conferences/...
Thanks Julie Jargon for the article!
#EconSky @uclaanderson.bsky.social @stanfordhai.bsky.social
(n/n)
This matters for policy: while restricting AI use may in some cases benefit incumbent professionals, this should always be weighed carefully against the consumer benefits of enabling MANY people to use these tools to make their lives better - see my quote in the article!
(7/n)
The paper has more detail: descriptives (e.g. which websites are used with chatbots?), how we identify causal effects (a new measure of household AI exposure!) and a structural model. But it's a simple story: GenAI can be good for non-workers too - and we should care!
(6/n)
The economic story is that productive online tasks get more efficient with AI - and households reallocate that time to other uses.
As this story from the WSJ says: when AI helps to get the "work" in our private lives done faster, it frees up time for true leisure.
(5/n)
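As a back-of-the-envelope illustration of that reallocation mechanism (my notation, not the paper's structural model): if AI raises productivity on online "work" that must produce a fixed output, the freed-up time flows to leisure.

```latex
% Toy time-budget identity - illustrative notation, not the paper's
% model. Total online time T splits into productive time t_p = y/a
% (fixed output y at productivity a) and leisure t_l. Higher a frees
% up time for leisure:
\[
    T = \frac{y}{a} + t_{\ell}
    \quad\Longrightarrow\quad
    t_{\ell} = T - \frac{y}{a},
    \qquad
    \frac{\partial t_{\ell}}{\partial a} = \frac{y}{a^{2}} > 0 .
\]
```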
Importantly, we can see exactly what households are doing right before and after they use a chatbot - the task context of AI use - and it's mostly productive things. So AI is more likely to make "productive" online tasks more efficient than online leisure.
(4/n)
In our paper, we use internet browsing data to track when households use ChatGPT - and find that post-ChatGPT online browsing involves more leisure (social, gaming, streaming etc.) and less time on "productive" activities online (research, job search, shopping etc.)!
(3/n)
The WSJ article has some great anecdotes that align with our findings in the paper: while research has so far mostly focused on labor-market disruptions and on firms, GenAI use is common for private purposes - and can have equally large effects there!
(2/n)
🚨 New Working Paper! 🚨
Time to talk about "The Household Impact of GenAI", covered by the @WSJ!
In this paper, we ask what GenAI is doing to productivity outside of work - and find evidence of large impacts.
arxiv.org/pdf/2603.03144
w/ Miao Ben Zhang (USC) & Michael Blank (Stanford)
🧵(1/n)
More on our conference encouraging AI writing and also selecting papers using AI:
augmentedreturns.substack.com/p/launching-...
Please reach out to us if you have questions! Let us know what you think of this idea!
Also help us spread the word - we want to encourage a wide range of researchers, incl. grad students, or in industry, to submit!
#EconSky
Once the AI system has picked the best papers, we will invite the (human) authors to present their AI-enabled work at a regular conference - with human discussants - at UCLA. Enabling this human exchange is still the goal - even if AI review might help get us there. 5/
We're excited to try and design AI-driven review systems for the Human x AI conference and see how they perform. We're also excited to see what finance researchers can produce if they "unleash the AI" on a compressed timeline. It's unclear - but that's the point of experimenting. 4/
So @mktmacrostruct Barney Hartman-Glaser and I decided to see what a system looks like where AI use is encouraged for both paper writing & paper review. It's possible that this makes academia worse - but we're hopeful it does not! In any case - it's not clear we get a choice. 3/
My experience at conferences these days is: researchers exchange advice about how to use AI more, editors worry about how to screen out AI slop, authors worry about referee AI use, referees worry about author AI use. And no one seems to know how to design a better system... 2/
What happens when we encourage AI use for research and also use it to review papers?
We are running an experiment to find out: the UCLA Human × AI Finance conference!
Write a finance paper with AI in 4 weeks (by 3/18). AI agents review the submissions:
humanxaifinance.org 🧵 1/
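(I don't know what the conference's actual pipeline will look like; as a purely illustrative sketch, an AI review pass might be as simple as the loop below, with score_paper stubbing the agent call that would really read each submission.)

```python
# Purely illustrative sketch of an AI paper-review loop - NOT the
# conference's actual system. score_paper stubs whatever agent or
# model API would really read the paper and return a score.
from dataclasses import dataclass

@dataclass
class Review:
    paper_id: str
    score: float   # e.g. 1-10 overall assessment
    summary: str   # short rationale for the score

def score_paper(paper_id: str, text: str) -> Review:
    """Stub: a real system would prompt an LLM agent with the paper text."""
    return Review(paper_id, score=len(text) % 10, summary="stubbed rationale")

def select_best(papers: dict[str, str], k: int = 3) -> list[Review]:
    """Score every submission, then keep the k highest-scoring papers."""
    reviews = [score_paper(pid, text) for pid, text in papers.items()]
    return sorted(reviews, key=lambda r: r.score, reverse=True)[:k]

submissions = {"paper-a": "draft text...", "paper-b": "another draft..."}
for r in select_best(submissions, k=1):
    print(r.paper_id, r.score, r.summary)
```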
"Family firms" talks about various non-monetary benefits that owners can derive from firms - which presumably mean they don't optimize prices if those conflict with their other objectives. I guess Holmstrom multi-tasking + multiple objectives imply similar things?
Vibe-coding’s complement is unit testing. If the writing of code is commodified, the validation of output becomes scarce.
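A tiny hypothetical example of what that scarce validation work looks like - the function could be AI-written, but the tests encode the behavior a human actually wants:

```python
# If an AI "vibe-coded" parse_price below, the tests are where the
# human adds value: they pin down the behavior we actually want.
# (Hypothetical example - function and cases are made up.)

def parse_price(s: str) -> float:
    """Parse a price string like '$1,234.50' into a float."""
    return float(s.strip().lstrip("$").replace(",", ""))

def test_parse_price() -> None:
    assert parse_price("$1,234.50") == 1234.50
    assert parse_price("19.99") == 19.99
    assert parse_price("  $0.05 ") == 0.05

if __name__ == "__main__":
    test_parse_price()
    print("all tests passed")
```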
A randomized trial of AI for legal work finds reasoning models are a big deal:
Law students using o1-preview saw work quality increase on most tasks (by up to 28%) & time savings of 12-28%
There were a few hallucinations, but a RAG-based AI with access to legal material reduced those to the human level
These findings have policy implications - groups with a greater affinity for homeownership are more likely to benefit from (or be harmed by) changes in credit or housing policy. Heterogeneity in take-up can result from group affinity!
Link to paper is here: papers.ssrn.com/sol3/papers....
#Econsky
We find that the non-RE wealth increases of HHs with high HO affinity are limited to those in the right housing markets at the right time who happen to see high price increases - NOT a general consequence of homeownership. So housing policy has limited reach if housing booms are not guaranteed!
We obtain restricted HRS data to see if affinity for HO impacts the portfolios of foreign-born retirees: as expected, they are more likely to own a home & hold more RE in their portfolios. Total non-RE retirement wealth is also higher for those with high HO affinity in their origin country! But why?...
We show that AFFINITY matters for housing cycles and the effects of credit supply shocks. High-HO-affinity households enter homeownership at higher rates during the 2000s housing boom and default less during the GFC - see the paper for causal evidence on their greater response to credit supply shocks.
It's hard to find exogenous changes in homeownership (HO) to study effects on HH finance. We build on the literature on how experiences/origins drive financial choices and show that HO in origin countries (HOCO) drives HO of the foreign-born in the US! (15% passthrough)
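In regression terms, the 15% passthrough can be read roughly as follows (my notation, not necessarily the paper's exact specification):

```latex
% Illustrative passthrough regression - notation is mine, not
% necessarily the paper's specification. For immigrant household i
% from origin country c(i), with controls X_i:
\[
    HO_i = \alpha + \beta \, HOCO_{c(i)} + X_i'\gamma + \varepsilon_i,
    \qquad \hat{\beta} \approx 0.15,
\]
% i.e. a 10pp higher homeownership rate in the origin country is
% associated with roughly 1.5pp higher homeownership in the US.
```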
🚨 New working paper with Caitlin Gorback!
We ask: what happens when households are more likely to WANT to own a home for cultural reasons? We find homeownership increases, they're more responsive to credit supply shocks, and more of their retirement portfolios are in real estate. 🧵
Managers and leaders will need to update their beliefs for what AI can do, and how well it can do it, given these new AI models. Rather than assuming they can only do low-level work, we will need to consider the ways in which AI can serve as a genuine intellectual partner. These models can now tackle complex analytical tasks, creative work, and even research-level problems with surprising sophistication. The examples I've shared - from creating interactive 3D visualizations of academic concepts to performing PhD-level analysis - demonstrate that we're moving beyond simple automation into the realm of AI-powered knowledge work. These systems are still far from flawless, nor do they beat human experts consistently across a wide range of tasks, but they are very impressive.
A new generation of AIs: Claude 3.7 and Grok 3 www.oneusefulthing.org/p/a-new-generation-of-ai... #AI #Claude #Grok
I love this as an example of an expert using LLMs as tools to help further elevate their work. I'd love to read more examples like this one from experts in other fields.
A professional workflow for translation using LLMs - Simon Willison. simonwillison.net/2025/Feb/2/workflow-for-...
🧪 A @ai.nejm.org study found that GPT-4 translation of discharge notes significantly improved patient understanding, especially for Black, Hispanic, and older patients. AI-driven readability could help close health literacy gaps.
🩺💻
tagging: @downing.bsky.social @thelizarmy.bsky.social
I am worried LLM researchers sometimes bury the lede with regard to "should we trust these systems". The framing below is: LLMs are failing to "earn human trust". But it turns out it's the humans who cannot be trusted - even after seeing the LLM's answer, the humans do worse than the LLM!
I think one way of describing these results to our future LLM overlords is “inference time compute scaling leads to thinking more but not to thinking better - in humans”