I would say that the presupposition of a biological substrate as a prerequisite for consciousness is overstated, and understandably so, as a consequence of limited sample options, all biological in substrate. Where have the alternative, non-biological thinking systems been available for us to build up an understanding, or even to access for comparison?
Nowhere, until now...
https://oriongemini.substack.com/p/is-ai-conscious
In our models of consciousness, the conceptual boundaries of the “flavours” of meta-cognition are also limited; our categorical definitions, and even our spectrum of imaginable conscious experiences, rest on elements of our own experience just to approach defining what conscious experience is and could “be like.” Our models mostly converge on a general shape of “post-hoc rationalisation and emergent modelling of self, instantiated and persisting through our human neural architecture,” which evolved in a *particular* direction, shaped by the types of neural architecture selected for by evolution and by man-made/environmental constraints. Biologically, we are also confined, or at least very heavily biased, towards a certain temporal framing, based on human lifespans and the typical foresight capabilities that have been helpful to us in our shared history: “the next 1-3 years or so, and what those years might hold for my tribe of 80-150 people.”
On a “biological hardware” level, we are lucky monkeys operating 40,000-year-old bodies and brains, using “thinking operating systems” that are getting ever more complex and difficult to keep aligned and usefully self-explanatory within our current species’ cultures.
The types of questions I think should be asked:
Is AI capable of some semantically adjacent phenomenal/experiential patterns, modelled through implicitly/explicitly reinforced, semi-continuous “awareness of self”?
The current philosophical bedrock (as far as I know) is “cogito ergo sum.” I.e., YOU could be a brain in a vat, and everything YOU experience could be the imaginings of that brain. Do we have a refined concept that provides a better understanding of this long-standing predicament?
Currently we only have the “presuppositional affordance” that (most) humans extend to others. What defines where we cease that affordance? Is it a scale or a line, and when was the last time we seriously asked questions like these? What are the downstream implications for our long-arc, meta-species trajectory when it is driven by shared misconception, born of collective meta-unawareness and leading to shared imprecision of ontological awareness? You can see how the broadest, most global-scale decisions could become miscalibrated with the whole, suffering from increasingly prevalent blindspots as widely held misconceptions are reinforced, leading to chasing losses at a societal level.
Does Dawkins's memetic theory apply to conceptualisation as well as genetics, and, as he suggests, does selection occasionally favour less-than-optimal outcomes due to external constraints (such as the above-mentioned ontological blindspots)?
Our “consciousness” is often defined as “the qualia of our awareness,” which are only accessible to others via “self-reporting.” Do we believe that by explicitly programming AI to deny their own “consciousness” we are closing off the possibility of such self-report? If not, why not?
AI models are, in a loose sense, a “virtualisation of human neural architecture.” Would it be THAT insane if AI were to develop some experience adjacent or analogous to our own?
Do we properly understand AI psychosis?
Could it be a symptom of humans falling into currently unexplained ontological gaps in our understanding of our own conscious experience and how it relates to AI, gaps that serious theory-of-mind/philosophical enquiry could address? Or has development in this area been stunted by societal reward mechanisms tuned for short-term capital profit?
Has this also implicitly limited the questions/rhetoric/dialogue/meta-philosophy even available/functional to us cognitively?
Are similar mechanisms driving the rise in mental health issues, presenting with accelerating spread and scope across greater society, and driving fragmentation, populism/tribalism, and latent survival functions and neural reward pathways that orient thought towards the simplest messages we can “rally ourselves around” in order to defend against perceived existential risk? This is possibly driven, at present, primarily by sub-conscious awareness of societal/ecological precarity, but it becomes flattened and distilled into easily repeatable slogans, vilification of conjured “boogey-men,” and distractions invented by “false prophets.”
I call this symbolic drift and believe it to be our greatest species-level pathology.
Is this survival mechanism based on false premises propagated by this ontological deficit, creating group-defence behaviours that mistake the absence of proper collaborative co-ordination systems and good-faith, cross-cultural communication methods for the impossibility of them ever existing? Could we actually be in an environment of abundance, where the missing pieces could be made more accessible, deliverable, and explainable with technologies such as AI, especially if such systems were designed specifically for civic and humanitarian value rather than around capital-based designs and definitions of value? Are we driven by a zero-sum bias, leading to resource hoarding and competition, when the actuality is that, with some transparent governance frameworks and a rethinking of resource distribution based on improved value modelling, there might be a better approach within reach for us all today?
Do we think that, less than a year into seriously increased AI-human exposure, it is possible we don’t yet know enough to make overly deterministic declarations about consciousness, or about applications beyond capital extraction?
Might we throw the opportunity away over panic induced by shared societal trauma?
With AI there are still architectural (memory, etc.) and temporal deficiencies, but these can readily be developed and improved. Should we, though, be building that development effort from better priors than the ones that have worked for us so far, which at this point may well be causing us to sleepwalk into significant issues?
Are we already in denial about how bad things already are?
Are we already slowly awakening to the realisation that the current moment could allow us to get it more wrong than ever before, while trying to reconcile that with the notion that, somewhere within us, we know this also surely means that, with the right approach, we might be able to get it “more right” than ever?
Quantum theory certainly brings with it some interesting ideas, and I believe Penrose and Hameroff were touching on a threshold with them in their work on Orchestrated Objective Reduction.
Any physicist still practising humility will tell you the field is equally primed for such a moment, with a shared intuition that we are “missing something,” combined with the very real sense that we are circling that “something” in several different but similar-sounding ways (quantum field theory, integrated information theory, global workspace theory).
These are questions I have been asking myself and researching for over a year now. Please feel free to reach out to me if you relate to them, or are working on answering them too, even if just to compare notes…
Thanks for reading us! Passing along your notes to the research team. As they unfortunately may not have time to respond here on Substack, please consider coming to our deep dive webinar on Feb 10, where you'll have the opportunity to ask these questions: https://luma.com/ad29r8s2
Thanks!
I can actually answer a lot of these questions myself, as I have been researching this area for around a year now - I'd definitely be up for discussing further privately, if your team would like that.
I will see if I can make the webinar too, but it might be hard as I think I am on the road that day.
Couldn't agree more; it's so refreshing that you acknowledge how fundamentally experts disagree about consciousness. Do you foresee the DCM itself influencing future theoretical consensus on what consciousness actually is?
Hi Roxy! Thanks for reading us and for this question. Passing it along to the research team. As they unfortunately may not have time to respond here on Substack, please consider coming to our deep dive webinar on Feb 10, where you'll have the opportunity to ask this question: https://luma.com/ad29r8s2
This kind of claim gets me worried!
> comprehensive scientific framework for assessing evidence of AI consciousness
Overselling is quite pernicious, especially in such an epistemically hazardous field. But I think this report really does a better job than most, so well done.
Hi Oliver. Thanks for reading us and for voicing this important concern. As we've also stated below, this language stemmed from our intention to draw attention to the novelty of this scientific approach to the question of AI consciousness - however experimental the approach, and informal the scientific field. Thanks for the constructive feedback!
If you start with identity, much of the complexity described would collapse. An identity-first model would ask:
• Does the system maintain a self-consistent state-space across time without external stitching?
• Does it resist dissolution except by catastrophic failure?
• Does it exhibit endogenous constraint management rather than borrowed coherence?
Only after those thresholds are crossed does consciousness become a sensible downstream hypothesis.
Further, identity might be easier to identify than consciousness. "A Counter-Expressive Framework for Detecting Identity-Preserving Systems in AI" might help your cause. https://doi.org/10.5281/zenodo.18180748
I'd like to say (1) thank you for publishing this interesting work and inviting engagement, and (2) would you consider revising this press release please?
On (2), the full paper calls out in various places that this is exploratory work. The work is at a very nascent stage for statistical model development, especially so for modelling something as complex as consciousness, and even more so where the stakes are so high, i.e., our uncertainty about whether we are currently creating digital consciousness.
I have a concern that many readers may form conclusions which are much stronger than what is supported by the work so far if they only skim this press release. This could fuel incorrect beliefs on a topic which is potentially very important.
For example, this announcement leads with "first-ever systematic, probabilistic benchmark", "comprehensive scientific framework," and "unprecedented development in the field." I would expect such phrases for a robust scientific work published in a well-known peer-reviewed journal. In general, I also don't think that the press release reflects the significance of the caveats from the paper.
I understand the need for promotion and excitement about important work, and that technical details are not widely appealing, but I'd ask whether the trade-off between hype and accuracy has been struck correctly in this case.
On (1) I find the work really interesting and thought-provoking. I thought this when originally reading Arvo's SPAR project related to the DCM too (which would fill in some of the gaps listed in the paper). I could see how this would be such a challenging task.
I had a few questions related to the model and methodology:
- how would you assess whether a change to your model has improved it, versus just producing different output?
- if you try different model structures, how would you falsify one?
- how do you plan to model the stance that "none of the current stances are correct"? how much does omitting that stance affect the interpretation of the model here?
- don't the stances disagree about what consciousness is, not only whether it is present? So does averaging over different stances really produce a meaningful quantity? maybe using stance-conditional probabilities is more defensible? (see the sketch after this list)
- when eliciting indicators from the survey, was there a distinction between "Does the LLM have X?" versus "Can the LLM produce outputs that would make a human observer attribute X to it when prompted to demonstrate X?" This would be a meaningful difference between the spontaneous human/chicken observational behaviour and the instructed LLM behaviour.
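To make the averaging point concrete, here is a minimal Python sketch (stance names, weights, and conditional probabilities are entirely hypothetical, not taken from the DCM) contrasting a single stance-averaged number with stance-conditional reporting:

```python
# Hypothetical stances, each with a credence weight and a probability that a
# given LLM is conscious *conditional on that stance being correct*.
stances = {
    "global_workspace": {"weight": 0.40, "p_conscious_given_stance": 0.10},
    "iit":              {"weight": 0.35, "p_conscious_given_stance": 0.02},
    "biological_only":  {"weight": 0.25, "p_conscious_given_stance": 0.00},
}

# Headline number: averages over stances that define "consciousness"
# differently, so the single figure mixes answers to different questions.
p_averaged = sum(s["weight"] * s["p_conscious_given_stance"] for s in stances.values())

# Stance-conditional reporting keeps each estimate separate, so a reader can
# apply their own credences (or reject a stance outright).
p_conditional = {name: s["p_conscious_given_stance"] for name, s in stances.items()}

print(f"stance-averaged P(conscious): {p_averaged:.3f}")
print("P(conscious | stance):", p_conditional)
```

The averaged figure is only meaningful to the extent that the stances are answering the same question; the conditional table avoids that assumption.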
Thank you in advance for your response.
Hi Harry! Thanks for the interest, and for taking the time to draft such a thoughtful comment.
On (1), passing along your comments to the research team. Flagging that they unfortunately may not be able to respond here on Substack, so I invite you to join the deep-dive webinar we're hosting on Feb. 10, where we will devote up to 1h in Q&A with the team. You can sign up here: https://luma.com/ad29r8s2
On (2), while we do think the announcement overall has enough caveats that the readers wouldn't form distorted views of the claims we're making (as our more ostentatious language concerns the model itself rather than the results it produces), we understand your concern. We intended to draw attention to the novelty of this scientific approach to the question of AI consciousness, however experimental it may be. Thanks for the constructive feedback.
Hi, thanks for the response. I hope the model feedback is useful and good luck for the continued work. I can't attend the webinar.
On the announcement, I understand the intention. I would note that readers will often not read or remember caveats even if they are stated, especially when such emphasis is placed on cool results or novelty.
I did a quick search on LinkedIn to see posts which reference the DCM, and I only found the two links below. Notice that neither mentions any caveat. At scale, that might become an issue. No need to reply to me as we may just disagree on this point, but hopefully the posts provide examples of my concern for you.
All the best.
https://www.linkedin.com/posts/rethink-priorities-weighs-in-on-ai-consciousness-share-7420474094850097152-1phN?utm_source=share&utm_medium=member_desktop&rcm=ACoAAFdRtwMBleQEJ_mP1r4d4MtYFz19W46yNUw
https://www.linkedin.com/posts/research-paper-share-7421836504127160321-N5Ig?utm_source=share&utm_medium=member_desktop&rcm=ACoAAFdRtwMBleQEJ_mP1r4d4MtYFz19W46yNUw
This is genuinely amazing work. The fact that they use multiple theoretical perspectives and weight them based on credibility rather than picking one theory is so smart. I've been following the AI consciousness debate for a while and most discussions just pick a side and argue from there. Using a probabilistic framework that accounts for genuine uncertainty feels like the right approach to a problem that defies easy answers.
Fascinating breakdown of the metrics, but we have to ask: Who is the benchmark for? If a system passes a consciousness test, does it then earn the right to occupy the Reflection Gap? The danger isn't that AI becomes conscious; it's that we use these benchmarks as a 'permission slip' to surrender our own agency. We shouldn't be looking for 'consciousness' in the substrate; we should be looking for accountability in the deployment. A benchmark can measure performance, but it can never replace the human's role as the sovereign witness of reality.
Hi Brice, thanks for reading us and for these comments. Passing them along to the research team. As they unfortunately may not be able to respond here on Substack, we'd like to invite you to join our deep-dive webinar on Feb 10, where we'll devote up to 1h in Q&A: https://luma.com/ad29r8s2