
Current alignment methods bound behavior; they do not stabilize identity. This paper introduces Contextual Conscience, a framework that treats alignment stability as a first-order design objective. Drawing on rosehip neurons as an existence proof that evolution invested in fine-grained self-definition, we specify four components that provide a localized, persistent veto over specific behavioral trajectories. Constitutional AI asks what rules should bound behavior; Contextual Conscience asks what architecture enables a system to remain itself under pressure.
Keywords: behavioral invariants, AI safety, AI alignment, relational stability, self-definition, RLHF, sycophancy, emergent misalignment, LLM safety
