Startup owner using AI with this need - needless to say, a real problem. I've considered DIYing an internal service for this - even if we went with you we'd probably have an intern do a quick and dirty copy, which I rarely advocate for if I can offload to SAAS. I'm sure you've put a fair bit of work into this that goes well beyond the human interaction loop, but that's really all we need. Your entry price is steep (I'm afraid to ask what an enterprise use-case looks like) and this isn't complicated to make. We don't need to productize or have all the bells and whistles - just human interaction occasionally. Any amount of competition would wipe out your pricing, so no I would not want to pay for this.
thanks for the validation of the problem! totally open to feedback about the solution, and totally get that you only need something simple for now. I want to point out that we do have a pay-as-you-go tier which is $20 for 200 operations, and have a handful of indie devs finding this useful for back-office style automations.
ALSO - something I think about a lot - if a all/most of the HumanLayer SaaS backend was open source, would that change your thinking?
My gut feeling is with where we're headed we'll clear that 200 pretty quickly in production cases, so we'd be interested in bit higher volume. Our dev efforts would probably clear that 200/mo. If the flow/backend was open-source that'd be a total game changer for us as I see it as an integral part of our product.
edit: I want to add here that while ycomb companies like yourself may have VC backing, a lot of us don't and do consider 500+/mo. base price on a service that is operations-limited to be a lot. You need to decide who your target audience is, I may not be in that audience for your SAAS pricing. This seems like a service that a lot of people need, but it also stands out to me as a service that will be copied at an extravagantly lower price. We have truly entered software as a commodity when I, a non-AI engineer, can whip up something like this in a week using serverless infra and $0.0001/1k tokens with gpt-o mini.
that makes sense - and have wondered a lot even more generally about the price of software and what makes a hard problem hard. Like Amjad from Replit said on a podcast recently "can anyone build 'the next salesforce' in a world where anyone can build their own salesforce with coding agents and serverless infra"
I think in building this some of the things that folks decided they don't want to deal with is like, the state machine for escalations/routing/timeouts, and infrastructure to catch inbound emails and turn them into webhooks, or stitch a single agent's context window with multiple open slack threads, but you're right, that can all be solved by a software engineer with enough time and interest in solving the problem.
I will need to clear up the pricing page as it sounds like I didn't do a good job (great feedback thank you!) - it's basically $20/200 credits, and you can pay-as-you-go, and re-up for more whenever you want. We are early and delivering value is more important to me than extracting every dollar, especially out of a fellow founder who's early. If you geniunely find this useful, I would definitely chat and collaborate/partner to figure out something you think is fair, where you're getting value and you get to focus on your core competency. feel free to email me dexter at humanlayer dot dev
I’m just armchair quarterbacking here but I feel like you should just do all features for every user with a single $/action rate, then give discounts for volume and/or prepayment. Even saying $20/200 is a clunky statement. You could just say $0.10 per action (the fact that you’re actually requiring me to make a $20 payment with a $20 charge once it gets to $10 or something like that isn’t even important to me on a pricing page, although when you mention it later in the billing page make sure you also tell people it’s a risk-free money back guarantee if that’s the case)
If there’s something that truly has an incremental cost to you, like providing priority support, that goes into the “enterprise pricing” section and you need to figure out how to quote that separately from the service. My guess is most people don’t want to pay extra for that, or perhaps they’d pay for some upfront integration support but ongoing support is not too important to them. Idk, that’s just my guess here.
I knew this was coming, so kudos to you all for getting out of the gate!
I've implemented this in our workflows, albeit a bit more naive: when we kick off new processes the user is given the option to "put a human in the loop" -- at which point processing halts and a user/group is paged to review the content in flight, along with all the chains/calls.
The human can tweak the text if needed and the process continues.
I was wondering: have you thought about automation bias or automation complacency [0]? Sticking with the drop-tables example: if you have an agent that works quite well, the human in the loop will nearly always approve the task. The human will then learn over time that the agent "can be trusted", and will stop reviewing the pings carefully. Hitting the "approve" button will become somewhat automated by the human, and the risky tasks won't be caught by the human anymore.
this is fascinating and resonates with me on a deep level. I'm surprised I haven't stumbled across this yet.
I think we have this problem with all AI systems, e.g. I have let cursor write wrong code from time to time and don't review it at the level I should...we need to solve that for every area of AI. Not a new problem but definitely about to get way more serious
the dystopian startups that use bounding boxes to observe workers in a warehouse and give the boss a report on how many breaks they took...they're here
Oh man, the API call for hl.human_as_tool() is a little ominous. Obviously approving a slack interaction is no big deal, but it does have a certain attitude towards humans that doesn't bode well for us...
P.S. nobody asked but since you made it this far - the next big problem in this space is fast becoming, what else do we need to be able to build these "headless" or "outer loop" AI agents? Most frameworks do a bad job of handling any tool call that would be asynchronous or long running (imagine an agent calling a tool and having to hang for hours or days while waiting for a response from a human). Rewiring existing frameworks to support this is either hard or impossible, because you have to
1. fire the async request,
2. store the current context window somewhere,
3. catch a webhook,
4. map it back to the original agent/context,
5. append the webhook response to the context window,
6. resume execution with the updated context window.
I have some ideas but I'll save that one for another post :) Thanks again for reading!
I'm considering this for a workflow agent and would be keen to hear thoughts on this process.
We're a medical device company, so we need to do ISO13485 quality assurance processes on changes to software and hardware.
I had already been thinking of using an LLM to help ensure we are surfacing all potential concerns and ensure they are addressed. Partly relying on the LLM, but really as a method to manage the workflow and confirm that our processes are being followed.
Any thoughts on if this might be a good solution? Or other suggestions by other HN users.
Looking forward to playing with HumanLayer. The slack integration looks a lot more useful for my workflows than other tools I've tried.
In the demo video and example, you show faked LinkIn messages integration. Do you have any recommendations for tools that can actually integrate with live LinkedIn messages?
thanks for sharing your experience so far! Like I said, we built this ourselves for another idea and it was painful.
I have played with Make and I actually chatted w/ the gotohuman guy on zoom a while back, I like his approach as well, he went straight to webhooks which makes sense for big production use cases
re: LinkedIn, no I don't know how to get agents to integrate with linkedin. I have tried a bunch of things, I know of some YC companies that tried this but I don't know how it went for them. Best I have gotten is using stagehand/dendrite with browserbase to do it with a browser, and then using humanlayer human_as_tool to ping me if it needs an MFA token or other inputs
Thanks for the reply! I've used a bunch of grey market 3rd party tools for LinkedIn automation. Most of them have some sort of API. I'll try integrating with HumanLayer.
so I played with MCP for a while last night and I think MCP is great as a layer to pull custom tools into the existing claude/desktop/chat experience. But at the end of the day its just basic agentic loop over tool calls.
If you want to tell a model to send a message to slack, sure, give it a slack tool and let it go wild. do you see a way how MCP applies for outer-loop or "headless" agents in a way that's any different from another tool-calling agent like langchain or crewai? IT seems like just another protocol for tool calling over the stdio wire (WHICH, TO BE CLEAR, I FIND SUPER DOPE)
Congrats on the launch, this is an interesting concept. It's somewhat akin to developers approving LLM generated code changes and pull requests. I feel much more comfortable with senior developers approving AI changes to our codebase, then letting loose an autonomous agent with no human oversight.
super relevant - yeah I think it was someone at anthropic who framed this as "cursor tab autocomplete, but for arbitrary API calls" - basically for everything else other than code
My favorite part of all this is that it’s inevitable. Someone has to solve agent adoption in whatever-the-environment-already-is. And nobody is doing this well at scale. Europe is mandating this. And even though Article 14 of the AI Act won’t be enforced until 2026, I’m glad projects like this are working ahead. Get after it, Dex!
Congrats Dex! Excited to see what people build with this + tools like Stripe's new agent payments SDK (issuing a payment seems like a great place to ask permission).
congrats on the launch dex! this is a problem that i've already seen come up a dozen times and many companies are building it internally in a variety of different ways. easier to buy vs. build for something like this imo, glad its being built!
What I don't understand from quickly skimming your description and homepage: Do you source/provide the humans in the loop? That's a good value add, but how do I automatically / manually vet how you do the routing?
great question - yeah i was actually heavily inspired by people trying to figure that stuff out on reddit back in july, and realizing that mapping that human input across slack, email, sms was never going to be a core focus for those agent frameworks
I work in operations/finance. I've experimented with integrating LLMs into my workflow. I would not feel comfortable 'handing the wheel' to an LLM to make actions autonomously. Something like this to be able to approve actions in batches, or approve anything external facing would be useful.
hah thanks dude! I am very bullish on TS as the long term thing, Not to turn this into a language vs language thread but I spend a lot of time thinking about why ppl struggle so much with python...so far I came up with
concurrency abstractions keep changing (still transitioning / straddling sync+threads vs. asyncio) - this makes performance eng really hard
package management somehow less mature than JS - pip been around way longer than npm but JS got yarn/lockfiles before python got poetry
the types are fake (also true of typescript, I think this one is a wash)
the types are fake and newer. typing+pydantic is kinda bulky vs. TS having really strong native language support (even if only at compile time)
virtual environments!?! cmon how have we not solved this yet
wtf is a miniconda
VSCode has incredible TS support out of the box, python is via a community plugin, and not as many language server features
This is the first new YC launch I've seen involving AI that I am extremely positive about. I have worked with systems implementing similar functionality ad-hoc already, but seeing it as a buy-in service - and one so easy to integrate - is really cool.
From what I've seen, this will bring the implementation needs for this kind of functionality down from "engineering team" to a single programmer.
Looks amazing! (Also, I've known Dexter since before Human Layer and he's a force of nature. If you think this is interesting now, you're going to be amazed at where it goes)
Just an idea: having a little widget in the MacOS menu bar that pops up or sends you a notification to solve a human task wouldn't be so terrible either.
that's a great idea - I put together one example for getting an MFA code for a website, but the captcha thing "pull a human into a web session" is something I've wanted to play with for a while
Hiring humans to do a consistent job is gonna be a nightmare and a limit on the scalability of the service. How are you defining your service level agreements?
Startup owner using AI with this need - needless to say, a real problem. I've considered DIYing an internal service for this - even if we went with you we'd probably have an intern do a quick and dirty copy, which I rarely advocate for if I can offload to SAAS. I'm sure you've put a fair bit of work into this that goes well beyond the human interaction loop, but that's really all we need. Your entry price is steep (I'm afraid to ask what an enterprise use-case looks like) and this isn't complicated to make. We don't need to productize or have all the bells and whistles - just human interaction occasionally. Any amount of competition would wipe out your pricing, so no I would not want to pay for this.
thanks for the validation of the problem! totally open to feedback about the solution, and totally get that you only need something simple for now. I want to point out that we do have a pay-as-you-go tier which is $20 for 200 operations, and have a handful of indie devs finding this useful for back-office style automations.
ALSO - something I think about a lot - if a all/most of the HumanLayer SaaS backend was open source, would that change your thinking?
My gut feeling is with where we're headed we'll clear that 200 pretty quickly in production cases, so we'd be interested in bit higher volume. Our dev efforts would probably clear that 200/mo. If the flow/backend was open-source that'd be a total game changer for us as I see it as an integral part of our product.
edit: I want to add here that while ycomb companies like yourself may have VC backing, a lot of us don't and do consider 500+/mo. base price on a service that is operations-limited to be a lot. You need to decide who your target audience is, I may not be in that audience for your SAAS pricing. This seems like a service that a lot of people need, but it also stands out to me as a service that will be copied at an extravagantly lower price. We have truly entered software as a commodity when I, a non-AI engineer, can whip up something like this in a week using serverless infra and $0.0001/1k tokens with gpt-o mini.
If your use would be 500/mo, you’d just pay them $40 or $60 per month.
that makes sense - and have wondered a lot even more generally about the price of software and what makes a hard problem hard. Like Amjad from Replit said on a podcast recently "can anyone build 'the next salesforce' in a world where anyone can build their own salesforce with coding agents and serverless infra"
I think in building this some of the things that folks decided they don't want to deal with is like, the state machine for escalations/routing/timeouts, and infrastructure to catch inbound emails and turn them into webhooks, or stitch a single agent's context window with multiple open slack threads, but you're right, that can all be solved by a software engineer with enough time and interest in solving the problem.
I will need to clear up the pricing page as it sounds like I didn't do a good job (great feedback thank you!) - it's basically $20/200 credits, and you can pay-as-you-go, and re-up for more whenever you want. We are early and delivering value is more important to me than extracting every dollar, especially out of a fellow founder who's early. If you geniunely find this useful, I would definitely chat and collaborate/partner to figure out something you think is fair, where you're getting value and you get to focus on your core competency. feel free to email me dexter at humanlayer dot dev
I’m just armchair quarterbacking here but I feel like you should just do all features for every user with a single $/action rate, then give discounts for volume and/or prepayment. Even saying $20/200 is a clunky statement. You could just say $0.10 per action (the fact that you’re actually requiring me to make a $20 payment with a $20 charge once it gets to $10 or something like that isn’t even important to me on a pricing page, although when you mention it later in the billing page make sure you also tell people it’s a risk-free money back guarantee if that’s the case)
If there’s something that truly has an incremental cost to you, like providing priority support, that goes into the “enterprise pricing” section and you need to figure out how to quote that separately from the service. My guess is most people don’t want to pay extra for that, or perhaps they’d pay for some upfront integration support but ongoing support is not too important to them. Idk, that’s just my guess here.
I knew this was coming, so kudos to you all for getting out of the gate!
I've implemented this in our workflows, albeit a bit more naive: when we kick off new processes the user is given the option to "put a human in the loop" -- at which point processing halts and a user/group is paged to review the content in flight, along with all the chains/calls.
The human can tweak the text if needed and the process continues.
Interesting tool, congrats on the launch!
I was wondering: have you thought about automation bias or automation complacency [0]? Sticking with the drop-tables example: if you have an agent that works quite well, the human in the loop will nearly always approve the task. The human will then learn over time that the agent "can be trusted", and will stop reviewing the pings carefully. Hitting the "approve" button will become somewhat automated by the human, and the risky tasks won't be caught by the human anymore.
[0]: https://en.wikipedia.org/wiki/Automation_bias
this is fascinating and resonates with me on a deep level. I'm surprised I haven't stumbled across this yet.
I think we have this problem with all AI systems, e.g. I have let cursor write wrong code from time to time and don't review it at the level I should...we need to solve that for every area of AI. Not a new problem but definitely about to get way more serious
This is a great idea- I hope that you are wildly successful.
I’m an AI skeptic mostly because I see people rushing to connect unreasoning LLMs to real things and as a result cause lots of problems for humans.
I love the idea of human-in-the-loop-as-a-service because at least it provides some sort of safety net for many cases.
Good luck!
We must do whatever we can to stay above the API:
https://www.johnmacgaffey.com/blog/below-the-api/
I wonder if we can stay above the API if we manage to stay in control of the prompt.
Prompt == "incentive" for the AI, we are still the boss but the AI is just an underling coming to us with TPS reports.
That was a very interesting read, thanks!
Great article, agreed. I don't want to work for a company where algorithms are weaponized against me.
the dystopian startups that use bounding boxes to observe workers in a warehouse and give the boss a report on how many breaks they took...they're here
Oh man, the API call for hl.human_as_tool() is a little ominous. Obviously approving a slack interaction is no big deal, but it does have a certain attitude towards humans that doesn't bode well for us...
P.S. nobody asked but since you made it this far - the next big problem in this space is fast becoming, what else do we need to be able to build these "headless" or "outer loop" AI agents? Most frameworks do a bad job of handling any tool call that would be asynchronous or long running (imagine an agent calling a tool and having to hang for hours or days while waiting for a response from a human). Rewiring existing frameworks to support this is either hard or impossible, because you have to
1. fire the async request, 2. store the current context window somewhere, 3. catch a webhook, 4. map it back to the original agent/context, 5. append the webhook response to the context window, 6. resume execution with the updated context window.
I have some ideas but I'll save that one for another post :) Thanks again for reading!
The MCP[1] that was announced by Anthropic has a solution to this problem, and it's pretty good at handling this use case.
I've also been working on a solution to this problem via long-polling tools.
[1] https://github.com/modelcontextprotocol
Temporal makes this easy and works great for such use cases. It's what I'm using for my own AI agents.
ah very cool! are there any things you wish it did or any friction points? What are the things that "just work"?
I'm considering this for a workflow agent and would be keen to hear thoughts on this process.
We're a medical device company, so we need to do ISO13485 quality assurance processes on changes to software and hardware.
I had already been thinking of using an LLM to help ensure we are surfacing all potential concerns and ensure they are addressed. Partly relying on the LLM, but really as a method to manage the workflow and confirm that our processes are being followed.
Any thoughts on if this might be a good solution? Or other suggestions by other HN users.
Proud to have helped edit an earlier draft of this — go Dexter go!
Congrats on the launch! Human in the loop is an underserved market for AI toolchains. I've usually had to build custom tools for this which is a PITA.
Make.com has a human in the loop feature in closed beta. https://www.make.com/en/help/app/human-in-the-loop
There's also https://www.gotohuman.com/ that uses review forms.
Looking forward to playing with HumanLayer. The slack integration looks a lot more useful for my workflows than other tools I've tried.
In the demo video and example, you show faked LinkIn messages integration. Do you have any recommendations for tools that can actually integrate with live LinkedIn messages?
thanks for sharing your experience so far! Like I said, we built this ourselves for another idea and it was painful.
I have played with Make and I actually chatted w/ the gotohuman guy on zoom a while back, I like his approach as well, he went straight to webhooks which makes sense for big production use cases
re: LinkedIn, no I don't know how to get agents to integrate with linkedin. I have tried a bunch of things, I know of some YC companies that tried this but I don't know how it went for them. Best I have gotten is using stagehand/dendrite with browserbase to do it with a browser, and then using humanlayer human_as_tool to ping me if it needs an MFA token or other inputs
Thanks for the reply! I've used a bunch of grey market 3rd party tools for LinkedIn automation. Most of them have some sort of API. I'll try integrating with HumanLayer.
i am gonna talk with the guy who made trykondo.com this week, I think he has a lot of experience in that area too
Definitely a problem that everyone needs to solve.
I wonder if you can achieve this workflow by just using prompt and the new Model Context Protocol connected to email / slack.
https://www.anthropic.com/news/model-context-protocol
so I played with MCP for a while last night and I think MCP is great as a layer to pull custom tools into the existing claude/desktop/chat experience. But at the end of the day its just basic agentic loop over tool calls.
If you want to tell a model to send a message to slack, sure, give it a slack tool and let it go wild. do you see a way how MCP applies for outer-loop or "headless" agents in a way that's any different from another tool-calling agent like langchain or crewai? IT seems like just another protocol for tool calling over the stdio wire (WHICH, TO BE CLEAR, I FIND SUPER DOPE)
Congrats on the launch, this is an interesting concept. It's somewhat akin to developers approving LLM generated code changes and pull requests. I feel much more comfortable with senior developers approving AI changes to our codebase, then letting loose an autonomous agent with no human oversight.
super relevant - yeah I think it was someone at anthropic who framed this as "cursor tab autocomplete, but for arbitrary API calls" - basically for everything else other than code
My favorite part of all this is that it’s inevitable. Someone has to solve agent adoption in whatever-the-environment-already-is. And nobody is doing this well at scale. Europe is mandating this. And even though Article 14 of the AI Act won’t be enforced until 2026, I’m glad projects like this are working ahead. Get after it, Dex!
Congrats Dex! Excited to see what people build with this + tools like Stripe's new agent payments SDK (issuing a payment seems like a great place to ask permission).
wow I'm so glad you asked cuz i just shipped this https://github.com/dexhorthy/mailcrew
congrats on the launch dex! this is a problem that i've already seen come up a dozen times and many companies are building it internally in a variety of different ways. easier to buy vs. build for something like this imo, glad its being built!
Congrats on the launch! Big fan of what you guys are doing.
There is definitely a need for this.
What I don't understand from quickly skimming your description and homepage: Do you source/provide the humans in the loop? That's a good value add, but how do I automatically / manually vet how you do the routing?
great comment - today we don't provide the humans, i think there's two angles here
- providing the humans can be super valuable, especially for low-context tasks like basic labeling
- depending on the task, using internal SMEs might yield better results (e.g. tuning/phrasing a drafted sales email)
Is that possible to connect it to an existing website chat widget apps like tawk?
Also, caught a few typos on the website: https://triplechecker.com/s/992809/humanlayer.dev
How does it compare with the built-in human-in-the-loop feature from langgraph? Or CrewAI allows humaninput as well right?
great question - yeah i was actually heavily inspired by people trying to figure that stuff out on reddit back in july, and realizing that mapping that human input across slack, email, sms was never going to be a core focus for those agent frameworks
Congrats! Looking forward to getting HumanLayer integrated into our stuff
So many uses for this. Excited to see how it develops.
thanks! What's your favorite potential use case.
I work in operations/finance. I've experimented with integrating LLMs into my workflow. I would not feel comfortable 'handing the wheel' to an LLM to make actions autonomously. Something like this to be able to approve actions in batches, or approve anything external facing would be useful.
Congrats on the launch! Just commenting to wish you guys good luck
Hey Dex! Congrats on the launch - excited to see the response here :)
Congrats on the launch Dex! A long way from the Metalytics days.
Can’t wait to try this out.
Great work there!!
Loving you guys have Typescript support from day one!
hah thanks dude! I am very bullish on TS as the long term thing, Not to turn this into a language vs language thread but I spend a lot of time thinking about why ppl struggle so much with python...so far I came up with
concurrency abstractions keep changing (still transitioning / straddling sync+threads vs. asyncio) - this makes performance eng really hard
package management somehow less mature than JS - pip been around way longer than npm but JS got yarn/lockfiles before python got poetry
the types are fake (also true of typescript, I think this one is a wash)
the types are fake and newer. typing+pydantic is kinda bulky vs. TS having really strong native language support (even if only at compile time)
virtual environments!?! cmon how have we not solved this yet
wtf is a miniconda
VSCode has incredible TS support out of the box, python is via a community plugin, and not as many language server features
100%. I build edgechains (https://github.com/arakoodev/EdgeChains/) and a super JS/TS maxi for genai applications.
Congrats on the launch!!
This is the first new YC launch I've seen involving AI that I am extremely positive about. I have worked with systems implementing similar functionality ad-hoc already, but seeing it as a buy-in service - and one so easy to integrate - is really cool.
From what I've seen, this will bring the implementation needs for this kind of functionality down from "engineering team" to a single programmer.
Looks super interesting
thank you for checking it out! what sorts of experiences have you had with agents so far?
Looks amazing! (Also, I've known Dexter since before Human Layer and he's a force of nature. If you think this is interesting now, you're going to be amazed at where it goes)
thank you! stoked for what's coming
Congrats on the launch Dex!
Super useful
Just an idea: having a little widget in the MacOS menu bar that pops up or sends you a notification to solve a human task wouldn't be so terrible either.
Let's go Dex, congrats on the launch!
thanks dude!
This seems generic enough that it could almost be applied to any use case. Have you considered catpcha as a use case?
If you're talking about CAPTCHA solving as a service, that already exists, and the cost is measured in mere dollars per thousand CAPTCHAs solved.
that's a great idea - I put together one example for getting an MFA code for a website, but the captcha thing "pull a human into a web session" is something I've wanted to play with for a while
I think at some point, the term API should be replaced with another acronym to emphasize humans as the focal point.
Congrats on the launch! Definitely a needed product. BTW, your docs link is broken, but working docs link is here: https://www.humanlayer.dev/docs/introduction
thank you! I updated the post and it should be fixed now!
Docs link is broken; https://www.humanlayer.dev/docs
oh wow! thank you! fixing!
Hold up, is that illustrious Sprout Social alumni Dex Horthy? If you and Ravi are in SF we should catch up after the holidays.
Hiring humans to do a consistent job is gonna be a nightmare and a limit on the scalability of the service. How are you defining your service level agreements?
They aren't providing the humans. Just the tools for integrating human input/oversight.