CatalaLang/catala: Programming language for law specification





"We" will interpret the law for you and the judges, and "we" are not suspicious at all of having a hidden agenda to replace "the law" by "how we see the law" to benefit ourselves.

Is this a joke?


This project seems to implicitly assume that a formally specified code of laws, where statutes can be interpreted largely mechanistically, is a good thing (and by extension, that the existing system of human interpreters with discretion and margins of error is a problem to be overcome).

I don't disagree with this assumption outright, but it's certainly not obvious to me that it is correct, and the authors appear to present no arguments supporting the same.


This seems obvious to me, and it seems to be something that law already strives for: you end up with very precise law regardless via case law, but at a high legal cost to reach that point. At the very least, it's not something I'd expect to be on Catala's homepage.

I think this is closely related to rule of law too. Per

> In particular, laws should be open and clear, general in form, universal in application, and knowable to all.... The law should... comprise determinate requirements that people can consult before acting, and legal obligations should not be retroactively established.

What are the benefits of ambiguous laws?


> What are the benefits of ambiguous laws?

That they can deal with the massive amounts of ambiguity & blurred lines / grey areas the world has to offer. Too many of our everyday concepts can’t be expressed rigorously for a formal encoding & determinate decision process - a famous example being obscenity’s famed “I know it when I see it” definition - and many are highly context dependent.


"What are the benefits of ambiguous laws? "

Freedom and flexibility. Reality is infinitely complex; trying to encode real-world problems 100% accurately into code can only work by rounding a lot, which means oversimplification and, in turn, lots of "unjust" rulings when you take only the words of the law into account and not its intent. We already have this problem (a lot): many things that seem very wrong are legal by the letter of the law.

(Mobsters being released because of formalities, for example, even though everyone knew they were guilty.)


But that doesn't mean freedom or flexibility for citizens. It means freedom and flexibility for the judiciary. For citizens, it takes away their freedom and flexibility because vague or contradictory laws (often coupled with severe penalties) create risk, or the perception of risk, where there wasn't meant to be any to begin with. It causes people to stop trying new things out of fear that Rule By Law will happen to them.

Whose freedom is more important? The people's, or their rulers'? If you believe in enlightened rulers, you may legitimately conclude the latter, but the rest of us would rather the government be properly constrained. After all, that's why laws are written down to begin with, and it's why there is such a thing as the professional lawmaker.

Vague law is usually not written due to some deep philosophy of law making, after all. Nobody really tries to defend it in practice. It's almost always a result of lazy or politically contentious lawmaking when the people writing the laws don't really know what they want in the first place, so trying to divine their intent will never get you far.


"But that doesn't mean freedom or flexibility for citizens. "

It does, because if laws want to cover every case, they need to be so verbose and rule out so many possibilities that, in the end, every action is encoded in law, which will result in people only doing things that are explicitly allowed. That limits a lot.

Simple example: despite being an adult, I like to climb trees, sometimes also in parks. So I have had discussions with officials or wannabe officials about whether this is allowed. It turns out there is no rule allowing it, but there is also usually no rule forbidding it. Yet most people default to: if it is not explicitly allowed, it must be forbidden (which is a very German thing, but is not unique to us).

Michel de Montaigne, an old French philosopher who is also quoted around this thread, put it into words far better. I'll try to find the passage.


I'd say, it's useful to distinguish between intentional and unintentional ambiguity/underspecification. For example, many jurisdictions that have statutory contract law have a statute that says that a provision in a contract is unenforceable if it is "unconscionable". There is, of course, a broad spectrum of things that can fall under that term, but that's intentional, i.e. the law maker specifically wanted to open a door there for a judge to hold a contractual provision unenforceable if they need it to be to get to a just outcome.

However, a lot of ambiguity, underspecification, and just plain sloppiness in language is there for no good reason and causes problems, and it would obviously be a good thing to get rid of that.


"However, a lot of ambiguity, underspecification, and just plain sloppiness in language is there for no good reason and causes problems, and it would obviously be a good thing to get rid of that."

100% agreed. When clear (and simple) rules are possible, vague rules just create opportunities for shady things. And yes, we have many, many laws that could be made simpler and therefore more clear. And if a programming language can help with that, that would be awesome.


Can ambiguity be quantified and/or codified? At the risk of sounding like a painfully naive layman: would it be possible to define the range in which to apply the rule as a function of the range of inputs? When uncertainty on the inputs increases or veers outside of some well-known range, the bandwidth for the judge increases?


I think something like this exists actually, even though I can't think of a concrete example, but I am sure it is not easy to implement (in a just way) either.


Laws very often look for intent of actions as well as the actions themselves.

And they are ambiguous: "beyond reasonable doubt" is what, as a percentage, in your view?


I suspect the motivation of this project is a lot more practical than what you’re projecting onto it, and I suspect you might be discounting how often law is transformed into code that runs on state IT systems.

As someone who is familiar with the process of turning statutes into code, I can appreciate what this project endeavors to do, even if the value is limited to providing clarification for software developers.


It is hugely useful. I worked on implementing law in code for 10 years, and the big problem is this: there's constant communication between software devs and law people. For example, "did you code this like that?", "why did the system do that?", "if I code it this way, does it reflect the law precisely?". You have to constantly translate from the code base to the language of lawyers and vice versa. Doing that with regular code is no good: regular code ends up mixing law code and technical code. Making a clear separation with a dedicated language is a big plus.


It fits quite well within French legal tradition. Remember that it was Montesquieu himself who said 'le juge n'est que la bouche de la loi'. When I was working on my law degree, I (being a programmer) was quite interested in this field, and (although my French wasn't/isn't great, so I didn't get to go as deep as I would have liked) I found a large community doing such legalistic work in French-speaking parts of the world. The professor at my university who was into this sort of stuff also seemed to lean towards that sphere of influence.


Some laws, like tax bracket computation, are probably fine, but most laws should not be codified like this, in my opinion.


The French have a programming language for tax related work:


I don't think there's much of a problem with actually reasoning about a law's text that a computer can help solve. The complicated bit is weighing equities, which still requires humans and lawyers.


Absolutely. Although the clarity by creating algorithms from tax tables can be helpful, and sometimes the wording seems ambiguous. Although you probably also need lots and lots of examples. (It is as if you need unit tests!)
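To make the "unit tests" remark concrete, here is a minimal sketch in Python (not Catala). The brackets and rates are invented purely for illustration and do not come from any real tax table; `BRACKETS` and `income_tax` are hypothetical names.

```python
from decimal import Decimal

# Hypothetical marginal brackets: (upper bound, rate). None means "no upper bound".
# These numbers are made up for the example, not taken from any statute.
BRACKETS = [
    (Decimal("10000"), Decimal("0.00")),
    (Decimal("30000"), Decimal("0.10")),
    (None, Decimal("0.30")),
]


def income_tax(income: Decimal) -> Decimal:
    """Apply each marginal rate only to the slice of income inside its bracket."""
    tax = Decimal("0")
    lower = Decimal("0")
    for upper, rate in BRACKETS:
        if upper is None or income < upper:
            # Income ends inside this bracket: tax the remaining slice and stop.
            tax += (income - lower) * rate
            break
        # Income covers the whole bracket: tax the full slice and move on.
        tax += (upper - lower) * rate
        lower = upper
    return tax


# The "lots and lots of examples" then become plain assertions:
assert income_tax(Decimal("10000")) == Decimal("0")     # exactly at the first threshold
assert income_tax(Decimal("20000")) == Decimal("1000")  # 10000 taxed at 10%
assert income_tax(Decimal("50000")) == Decimal("8000")  # 2000 + 20000 taxed at 30%
```

The boundary case (income exactly at a threshold) is the kind of thing where the wording of a statute can be ambiguous, and where a worked example agreed on with the lawyers settles the question.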


Most tax-related rules in the US are specified in an XML-based business rules language. That's partly how tax prep companies are able to get rules that don't finalize until 12/31 into products that have to ship 6 weeks later.


I think you meant, "...humans, lawyers, and bribes."

No, I don't have a lot of faith in our legal system. Why do you ask?


Sure, but this new language wouldn't help with that.


+1. I see one benefit of this language -- it could make it much easier to write programs to compute taxes and benefits. Beyond that I don't see what it could possibly offer.

Are there any lawmakers, lawyers or judges excited about this, or is it only programmers?




I’m really interested in this project as another way to computationally express and use law. For more color on the project, here is a demo and discussion “Idea Flow” session I did with the project leads:


I'm actually in court later this month pertaining to some stolen traffic cones. Dismissed my lawyer just yesterday due to a dishonest swagger. I'll be preparing some documents using this after I prototype using a few frameworks. Will keep you posted.


Very cool. Pessimistically, I think that having a clear, understandable view of legal text so that people can navigate the law safely is against a lot of entrenched interests.


I don't think it's that. It's hard to write legal texts, and sometimes it's better to be vague, so the courts have some freedom when establishing jurisprudence.

Treaties can be written with vague wording to allow parties to sign it, even if there isn't 100% agreement. That's an old practice.


On the contrary, it's much better to have very clear text, otherwise it will turn against the citizen.

Imagine that you have an income tax where "income" isn't clearly defined. Someone will end up with an audit and a lawsuit from the tax office because their definition will be, of course, extensive (every income, including non-realized capital gains) whereas most citizens would only consider salaries.

In the end, you create legal uncertainty, and give courts way too much power.

For the record, I used to work for my country's government, and had to evaluate some laws in the making that were written in an abstruse way. When I asked why, the civil servant told me that it was so "they could pick the most favorable meaning in the case of a lawsuit".


It's a balance. If you get too specific then your laws quickly become outdated as technology and society evolve. Or you miss corner cases by not enumerating every little scenario.

So there are different tiers to deal with this.

- Constitution - Very abstract and very rarely changed.

- Statute - Sometimes abstract, sometimes specific.

- Administrative rules. Very technical but still intended for broad application.

- Individual court cases. Can be hyper specific.


It is probably better to be intentionally and clearly unclear when you want to be, and clearly algorithmic when you want that, than just stir it all together in the name of judicial discretion.


I don't think the example given increases legibility.

Programming tends to use less understandable but more precise verbiage in general.


That's far too reductive. The law is abstract, and still needs to be interpreted and prudentially applied to specific situations, and you cannot capture all of that in the law a priori. Furthermore, having a computer crunch through a bunch of predicates is easy compared to getting the facts expressed in a form that is crunchable. So for specific and narrow applications where such representations are not costly to produce or already exist, such an application of computational law is feasible. But broadly? No.

And then there's the distinction between lex and ius that I think needs to be considered in this context.


I had so much trouble getting catala to even compile that I got frustrated. It’s a piece of academic code that’s very much abandoned


It's hardly abandoned. The git history shows activity almost every day over the last few months. It's clearly being actively developed.


This is interesting, and not to criticize, but I wonder if transformer models' accuracy in interpreting law will obviate the need for something like this.

It would be interesting to train Large Language Transformer Models to generate this code for you based on the text in the laws. This way you have a deterministic testable output, without risk of hallucinations.


Even LLMs would be better off with a clear unambiguous syntax to define human laws and rules.


Doesn’t TurboTax have some kind of DSL they use to encode all of the tax rules?


Thought this was related to Catalan, the Western Romance language. The naming could confuse some Spanish and French users.


Don't forget about the Andorran users!


Programmers love to propose using a "programming language" or similar for law.

But this fails to recognize that _ambiguity (in some ways) is a fundamentally important part of law_.

This is because the world itself is fundamentally ambiguous (in some ways) rather than clear cut.

Naturally, not all kinds of ambiguity are wanted.

But in my opinion you can be sure that with "code as law" the ways loopholes are abused will get worse.

I would even go as far as to say that many laws should be more focused on what should be upheld than on the details of how (which is fundamentally less clear cut/more ambiguous).


Agreed. Although I don't think this is a bad idea, I think the idea of perfectly defined and perfectly enforceable laws is terrifying. If every law on the books today were able to be perfectly enforced and perfectly monitored, our lives would be utterly miserable.

I'm not going to argue that's a problem with laws vs. enforcement, but either way, our society is built around ambiguity and unequal enforcement of law.


Enforcement and clarity are different concepts. We don't have a big problem with courts or police declining to enforce laws because they aren't precise enough.


Do you have any examples where ambiguity is truly beneficial?

For all the examples I can think of, the most beneficial outcome is removing the law altogether.


A few easy examples:

- Fair use law (there are things which clearly are and are not fair use, but in between there is a huge gray area where you cannot really formulate generic, precise rules which work reliably).

- Patent law (which has a lot of issues, especially given how it's applied, but design-wise you fundamentally have ambiguity about questions like when something is "enough added innovation" to be patentable, as "the degree of innovativeness" is fundamentally subjective).

- Insult is also very subjective. Defining it as "only when insult was intended" would be bad, but so would "always when the person felt insulted", and even "if intended and felt insulted" has issues.

- Self-defense is in many jurisdictions based on the person feeling threatened, but in jurisdictions with sane law it also involves stuff like "a generic person would also have felt threatened", and then you still have to consider person-specific circumstances.

- Or a lot of stuff around what counts as insider information, e.g. for insider trading.


Speed limits are one example.


How so?

I would have thought that is a good example of a law placing an upper limit on risk in a very black and white way.

Maybe exemptions for passing on single lane roads might be a net reduction of risk but I can’t think of any other grey areas.


In many countries speed limits are quite clear cut,

though "clear cut" here sometimes still involves an assessment of danger, which fundamentally isn't 100% objective.

And e.g. in Germany you are only allowed to "drive as fast as it's save" even if the speed limit is higher (and that is a common occurrence, e.g. residential areas tend to be 30 km/h zones, but you have to slow down at nearly every crossing because anything else wouldn't be save, and if you have an accident at a crossing while driving 30 km/h in such a zone you are very likely very much screwed, depending on the damage done).

So I guess yes, speed limits in a certain way, too.


Thanks for the reply. Just a little spelling mistake, it’s “safe.” Besides that you write English better than me (a native) :P


I'm not speaking about perfect enforcement.

Rather, my point is that law, contrary to what movies love to pretend, is not about clever word tricks and nit-picking formulations.

(In court it still can be about clever arguing, including nit picking arguments if necessary.)

But code _is_ about nit-picking formulations, at least if we ignore documentation, naming conventions etc. and look solely at what the code does.

Code is meant to be precise.

Law is meant to be only as precise as necessary but no more than that. Or you could say it's meant to be as imprecise as viable.

Code is about the specific case (in general).

Law is about the generic case (in general), avoiding specific cases where possible.

Code is made for machines to consume.

Law is meant to be consumed by humans, with the ambiguously defined context of the situation.

This is so deeply rooted in law that I would argue it's (in general, with exceptions) not possible to translate any current laws to code without accidentally changing their meaning in a lot of subtle but meaningful cases.


I don't think there is any reason ambiguity would clash with a project like this. Take "value.fair_market" in the concepts section [0]. Sure, lawyers can argue over what this means, but these competing definitions can also be defined programmatically.

I agree with your idea that our interest in laws shouldn't focus on implementation details but I think they should focus on outcomes. This requires a method to produce an evaluation function to measure the outcomes of a new law, and a system such as Catala to help model expected outcomes and to help select between competing laws (eg if our outcome = "we want less pollution" then our policy might be "ban polluting industries" or "tax pollution externalities." Both have complex consequences which would be better analyzed automatically and measured empirically.)



Something like this got passing mention in Greg Bear's [1] book Moving Mars (1993) [2], under the name Legal Logic. The (human) Martians used it with AI assistance to formulate legislation for their newly independent society.

For those who don't know, Greg Bear was a well-known SF author who died less than a year ago. His passing was discussed here at the time [3] [4].

He was one of the authors that influenced my youth a great deal, and I particularly remember this aspect of Moving Mars as catching my imagination, so will be interested to read what Catala has to offer.






Ambiguity is a feature, not a bug. I used to spend a lot of time on business process automation, and even in those more structured and restricted settings, trying to codify procedures most often fails. The reason is that reality (a) has so many edge cases that it very rapidly devolves into chasing an ever-diminishing ROI, (b) is unknown to the middle management and business analysts who would have the authority to construct and sign off on it, and (c) relies on intelligent people applying creative, pragmatic solutions to keep the business running; straitjacketing those people into inflexible automatons is the most surefire way to sink the ship.


My hunch is that in any sufficiently large rule set there will be inconsistencies. Hand-wavily, think Gödel, or just the need for bounded domains in DDD.

Humans (or, well, AI) are needed to cope with inconsistencies.

That said, pointing out the fact of existence of inconsistencies could be very valuable. But a system needs to embrace them, not fight them.


Gödel’s theorems don’t imply inconsistency for all large systems (unless “large” is taken to mean something strange), just for systems which are both not super-weak in what they can say, and complete (or if they have their own consistency as a theorem).

I don’t think Gödel’s theorems particularly support the claim you’re making.

In fact, here is an argument that a consistent rule-set either can be extended to something consistent and complete, or can be extended to be arbitrarily large while staying consistent:

take a ruleset which is consistent, but for which there is something for which it has no prescription one way or the other (neither explicitly nor implied collectively by other rules) (I.e. “not complete”). Then, add a rule specifying that thing and nothing else which isn’t implied by that thing. This will be consistent, as if it were not, then the negation of the rule added would have already been an implication.

This will either yield a larger ruleset of the same kind (consistent and incomplete), or it will yield one which is consistent and complete. Gödel’s theorems show that if the ruleset is an axiom system which is sufficiently expressive (e.g. contains Peano arithmetic) then the latter cannot be the result. So in this case, there are arbitrarily large extensions of the rule-set.

If it isn’t an axiom system, or is one for a rather weak system, then the result may be a consistent and complete system; but in that case, well, why would you want it to be larger?
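The extension step in that argument can be written out compactly. The symbols are notation introduced here for illustration, not from the comment: $T$ is the rule-set, $\varphi$ the statement it leaves undecided, and $\vdash$ means "implies by the rules".

```latex
\begin{align*}
&\text{Assume: } T \nvdash \bot, \qquad T \nvdash \varphi, \qquad T \nvdash \neg\varphi. \\
&\text{Suppose } T \cup \{\varphi\} \vdash \bot. \\
&\text{Then } T \vdash \neg\varphi \quad \text{(by the deduction theorem)}, \\
&\text{which contradicts } T \nvdash \neg\varphi. \\
&\text{Hence } T \cup \{\varphi\} \nvdash \bot, \text{ i.e. the extended rule-set is still consistent.}
\end{align*}
```

This assumes the rules live in a logic where the deduction theorem holds (classical propositional or first-order logic, for instance); whether real statutes fit that mould is a separate question.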

Edit: perhaps what you are calling “inconsistencies” are what I would just call “exceptions”/“exceptional cases”?

To my mind, “embracing an inconsistency” doesn’t seem to make much sense in the case of law? Something has to be what actually happens. We (whether fortunately or unfortunately) cannot bring an actual contradiction into reality.

Well, I suppose if one takes a sub-truth(not sure if this is the right terminology? I mean the opposite of super-truth) approach to vague statements, one might say that a somewhat-bald man causes the statement “that man is bald, and also that man is not bald” to be true (and also false), and as such “bring a contradiction into reality”, but that’s not what I mean by the phrase.

I mean there is no full precise-ification of any statement, which we can cause to be simultaneously true and false irl.

Those acting as agents of the law must behave in some particular way.

When legal requirements contradict, people will not satisfy both of them. Perhaps one will be considered to take priority. Perhaps a compromise position between the requirements will be sought. Perhaps it will be left to the judgement of those following it on a case-by-case basis.

But in none of these cases is a contradiction implemented. Can they really be said to be embracing the contradiction?

Upon writing this edit I realize that I’m probably misinterpreting that part of your comment. I suppose the thing you are saying to embrace is not the individual contradictions themselves, so much as the system’s rules-as-written having contradictions, and therefore the necessity of dealing with such contradictions when implementing the rules, as the scenarios to which the contradictory statements apply, occur.


I think parent might have been referring to the inconsistency that Gödel noticed in the US Constitution when applying for citizenship.


This is less of a problem in legal systems as the legal system self admits to resting on unproven axioms.


The naming choice is really unfortunate. It's like naming a programming language "français", or "Deutsch" (or "English").

From the bottom of the readme:

> The language is named after Pierre Catala

I'd suggest changing it to PierreLang then.


"The language is named after Pierre Catala, a professor of law who pionneered the French legaltech by creating a computer database of law cases, Juris-Data."

As a native Catalan speaker I was quite surprised with the name! But it makes sense, since it's quite a common surname!


without formal training, one key thing I picked up is that most public understanding of legal concepts diverges from court understanding because law follows logical and/or gates

so “and” isn't a list of accepted criteria, it is a list of things that must be simultaneously satisfied

but it's only using logical gates most of the time

this is a good step in showing that. not a panacea but a good step!
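a tiny sketch of the and/or point, in Python rather than Catala. the rule and every field name here are made up for illustration (there is no real statute behind `eligible`): "a resident who is at least 18 and who either has a dependent child or is disabled".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    # Hypothetical facts about a person; invented for this example.
    age: int
    is_resident: bool
    has_dependent_child: bool
    is_disabled: bool


def eligible(a: Applicant) -> bool:
    # "and": every condition must hold at the same time.
    # "or": at least one of the alternatives must hold.
    return (
        a.is_resident
        and a.age >= 18
        and (a.has_dependent_child or a.is_disabled)
    )


# Meeting two of the three "and" conditions is not enough.
print(eligible(Applicant(age=17, is_resident=True, has_dependent_child=True, is_disabled=False)))
# Either branch of the "or" is sufficient once the "and" conditions hold.
print(eligible(Applicant(age=40, is_resident=True, has_dependent_child=False, is_disabled=True)))
```

the first call prints `False` and the second prints `True`, which is exactly the "simultaneously satisfied" reading described above.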


Laws are often written in the vaguest language to allow wide interpretation, especially laws regarding treason, communism, foreign agents, anti-war speeches and such things. A programming language won't help here.


I'm also somewhat skeptical. How would a program deduce that "cookies and similar technologies" will mean localStorage, sessionStorage, IndexedDB, etc.? Remember that the law was written well before some of these technologies even existed.


Lawyer here.

Oh, this again. I suppose this looks relatively harmless, but I'm always wary of "law is like computer code."

The impulse to think this can strongly solve any real problem in the law is intuitively attractive, but I strongly predict this mostly never happens; it's the law's job to be intensely practical in the face of hard-edged "computer-like" rules.

If anything, you get goofy confusion about what things "are?" My go-to on this is always the "smart contract" -- which can be useful little bits of automated robot money moving code, but emphatically are neither "smart" nor "contracts."


I can assure you that the person who wrote this is very well aware of the subtleties involved in formalising the law. Law + Programming is an active research field, and it is very far away from anything like smart contracts. It is full of brilliant people who have no pretension of replacing the law with computers, but simply want to be helpers where they can.


Sure, I think I posted my response not because "it could never solve ANY problems," but instead because "way too many non-lawyers, especially techy non-lawyers, have the deeply misplaced idea that this is a very important, perhaps THE most important, problem to solve in the law." It's just not very high on the list at all.


> It's just not very high on the list at all.

Maybe not for lawyers, no. But as a citizen I'm expected to comply with the law, with many many laws. It'd actually be nice if law was slightly more formally verifiable, so it would be easier for me to understand what to comply with.

Being able to break down clauses into more logical normal forms would probably greatly enhance the possibility of compliance.


It really wouldn't though.

For every bit of gained clarity, you'd also gain a ton of people like the "sovereign whatever" idiots who just love trolling, with the added negative of them having more "formal proof" of their untenable silliness.


I agree. The idea of logically representing law isn't the same as replacing law with computers. That's why I even posted this.


This is a problem, but not one that Catala suffers from on a first reading.

Some elements of law are amenable to translation into source code, and indeed anyone working in fintech will probably have done that at some point. If the law gives a threshold for a tax allowance, for example, you need to encode that requirement in accordance with the law. Being able to mark up the text of each regulation should make it much easier to be confident you've not missed anything.

Trying to write non-financial regulation as code is pretty much doomed to failure. But to the extent that tax or benefits regulations set out numbers that we have to translate into code anyway, it's good to have that code be verifiable against the specific regulatory text.


Law is like computer code, if:

- your compiler was AI-complete and adversarial and hated you

- your compiler was also not bound by any hard rules and could emit undefined behavior at any time

- your job scheduling and orchestration system was AI-complete and adversarial and actively hated you

- your runtime library had 50 different incompatible canonical implementations and can only be run by being forked by publicly-elected officials who blindly merge patches from bad-faith lobbyists

- the documentation for any of those 50 runtime libraries is paywalled per page behind if you're lucky

- the IDE is Microsoft Word, and the linter is a summer associate on their tenth cup of coffee

- you will inevitably get a non-technical client who thinks that the more times you have "notwithstanding the foregoing" in your code the more you can call yourself Web Scale


> Oh, this again. I suppose this looks relatively harmless, but I'm always wary of "law is like computer code."

"To a man with a hammer, everything looks like a nail." (Twain?)

Hasn't Cyc impressively demonstrated just how incredibly difficult and costly it is to formalize even the most basic matters of daily life?

There already was a discussion two years ago:


> Hasn't Cyc impressively demonstrated just how incredibly difficult and costly it is to formalize even the most basic matters of daily life?

I would offer that the "cost/benefit" analysis for such a formalism exists on at least two axes: the concept domain which one is attempting to formalize, and the benefit (and/or size of the consumer base) of any such working system.

I can wholly understand that trying to translate the entirety of English into a formal logic system sounds overwhelming. But, to side with a sibling commenter, why not at least start with the tax code, which is a personal pain point, has (presumably) a correct outcome for some cases, and is mostly algorithms-in-English?

And then, for the consumer side: ok, if I snapped my fingers and Cyc existed and worked, I struggle to think how exactly my life would change. If the formally-specified tax code existed and worked, I wouldn't have to rage-upvote almost every comment on the annual tax hatred thread.

I would even offer that an incomplete version could still be useful if one left "fuzzy" variables in the corpus, and said "welp, we can't define what a $Person is because of the hundreds of years of precedent, so you'll need an actual Judge for that". I don't mean to say that 50% of the corpus can be undefined variables, that's just silly, but I'd hope the tax code isn't built upon 50% undefined behavior, even if accountants want you to think it is.
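A rough sketch of what I mean by leaving a "fuzzy" variable open, in Python. Everything here is hypothetical: `must_file`, the 12000 threshold and the rule itself are invented for illustration, and `None` stands in for "nobody has decided this yet".

```python
from typing import Optional


def must_file(is_person: Optional[bool], income: int, threshold: int = 12000) -> Optional[bool]:
    """Return True/False when the rule decides mechanically, None when a human must decide.

    `is_person` is the fuzzy concept: True or False once someone (say, a judge)
    has ruled on it, None while it is still open.
    """
    if income < threshold:
        # Below the threshold nobody has to file, so the fuzzy question never comes up.
        return False
    if is_person is None:
        # The outcome hinges on the undefined concept: hand it to an actual judge.
        return None
    return is_person


for facts in [(None, 5000), (None, 20000), (True, 20000)]:
    verdict = must_file(*facts)
    print(facts, "->", "refer to a judge" if verdict is None else verdict)
```

The point of the design is that the undefined variable only blocks the cases that actually depend on it; everything else still gets a mechanical answer.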


> why not at least start with the tax code

There have been many such attempts (e.g. NKRL by Zarri et al., also funded by the EU). There are even societies that have been dealing with such issues for many decades. The formalization of law and language is only one of the issues. Like many previous attempts, this one suffers from the fuzziness of human language (even in the case of the tax code). Fuzziness is not a drawback; it is what makes it possible to communicate efficiently in the first place. In order for us to communicate effectively, we need an enormous amount of tacit knowledge about our environment that our culture and life experience bring. If one tries to formalize the language, as in the present approach, one must also take this knowledge into account, down to the last detail (an "upper ontology" is by far not sufficient for this, and Cyc after decades is still not finished). And the tacit knowledge, as well as the moral valuation of it, changes over time. And there are things like which stand in the way of a complete formalization. Lenat's 1990 book addressed many of the issues, but his more recent talks are also very informative; there he demonstrates how they had to extend the Cyc representation language to cope with the problem, and why e.g. RDF triples are not enough.


how many times have you fought in court to argue about "the spirit of the law"? I, for one, don't really care about this lang or the "law is computer code" thing. just wanted to know about a lawyer's life, I guess.


I agree with you, but this is literally what those in civil law jurisdictions believe.

There are some areas where automating things can be effective – e.g. tax systems.


I like to turn this argument around to see how absurd it is.

Why not just take the existing law, and have a machine execute it in the style of a computer program?

We wouldn’t need judges juries or lawyers. You’d just type the specifics of your case and any supporting documents/evidence into the computer and a verdict would pop out.

Of course, the system could be used for other stuff too, like checking building code compliance or engineering soundness, signing off on military and police action, setting the executive branch’s priorities, and so on.


Not a lawyer. My thinking is very likely naive as I have no experience in this matter.

I see two potential issues:

- Picking evidence. "Evaluating" the law might need access to all the evidence that could possibly exist, but that would certainly never be available, so you'd need someone to know which evidence to present. You probably cannot rely on some interactive process asking you for such and such evidence, because it is the evidence presented that would trigger the evaluation of chunks of the law. I would guess a lawyer with good knowledge of the law would probably be needed for this.

- Setting precedents. Wouldn't the "automated" law evaluation run into unprecedented cases all the time? You'd need someone to constantly issue a verdict on unforeseen situations all the time, and I guess you'd need a judge for this.

Maybe it could work on many "trivial" cases though.


Some people have argued that the fuzziness of the legal system can be a good feature for some reason, but you could always have a machine execute the law and a human make the final call. So you wouldn't need judges, juries, or lawyers, but you would need a team of legal shamans that sign off on verdicts


The problem isn’t checking the computer’s output.

It is that the law would need to encode all the stuff I said, so it would need to be nuanced enough to replace all engineering, leadership and administration roles. (And also anything involving ethics.)


> If anything, you get goofy confusion about what things "are?" My go-to on this is always the "smart contract" -- which can be useful little bits of automated robot money moving code, but emphatically are neither "smart" nor "contracts."

They are contracts—just not legal contracts. One of many types of contracts in the world that are not legal constructs.


Most contracts are intended to be enforceable by law, as far as I know. Which kind of non-enforceable contracts did you have in mind?


Social contracts? Personal contracts? God-sworn oaths? Covenants? The legal system is just a small corner of the universe.


Software-interface contracts generally aren’t (cf. Design by Contract). The parent is correct insofar as the use of the word “contract” is not limited to contracts as defined by law.


Right, I mean, I've heard this use and I think it's kind of silly.

It's like calling a hopefully-completed-circuit in some device an "electric contract" or something like that.


What's wrong with describing a hopefully-completed-circuit in some device as a contract?


It’s a contract (mutual agreement) between the client code and the implementation of the interface. The implementation fulfills certain obligations provided the client code fulfills certain other obligations. The interface defines what those obligations are. If the client code fails to meet its obligations, then the implementation of the interface isn’t bound to fulfill its obligations anymore either. The point is to think in terms of two parties, where the developer who is either using or implementing an interface takes on the role of one of the two contractual parties.
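
As a rough illustration of the two-party framing, here is a sketch in Python. `checked_sqrt` is a made-up example function, and plain `if`/`assert` checks stand in for the contract syntax that a language with built-in Design by Contract support would offer.

```python
import math


def checked_sqrt(x: float) -> float:
    """Interface contract (hypothetical example).

    Client obligation (precondition): x >= 0.
    Implementation obligation (postcondition): result >= 0 and
    result * result is close to x.
    """
    # If the client breaks its side, the implementation owes nothing.
    if x < 0:
        raise ValueError("precondition violated: x must be non-negative")
    result = math.sqrt(x)
    # The implementation checks that it kept its own side of the bargain.
    assert result >= 0
    assert math.isclose(result * result, x, rel_tol=1e-9, abs_tol=1e-12)
    return result
```

Neither party is a legal person here, yet the obligations run in both directions, which is the sense of "contract" being used.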


> They are contracts—just not legal contracts

I think this proves op’s point of goofy confusion for what things are.


Like, what language is? I'm not sure I get the point.


How is this different from Prolog?


Will it help?

On the one hand, I think it would be fantastic, if you had automated tests for the law. For example, when German politicians introduced the "hacker law", you could have pointed out that "This new law would break the 'security researchers need to be allowed to do penetration testing' test".

On the other hand, "Brexit is in conflict with the Good Friday Agreement, we need a solution for Northern Ireland." was known without machine-readable laws and tests, but politicians ignored it anyway.

Maybe what's needed is a law that outlaws test-breaking laws and requires politicians to fix the tests first, but I bet that would just result in a lot of "commented" tests.


If you make a law against "test-breaking" people will just break that law.


(self plug) They published a paper describing the language, and here's a short video summary of it:


This seems like a worthy project but I think its landing page should define the scope of what it aims to do in more concrete terms. Otherwise it runs the risk of being seen as overambitious and open ended without a concrete problem to solve.

E.g. it's not clear if there is an explicit or implicit ontology against which the validity of any codification can be checked.


Would be so funny if all the compute moves from AI to finding loopholes in the programmatically defined law.


"haha! actually, if you give it this input, the program crashes! i can finally steal pillows from the bookstore!"


This project is doing code -> text, right?

But then, the first line of description in Github says:

> Catala is a domain-specific language for deriving faithful-by-construction algorithms from legislative texts.

This reads like 'text -> code', which is the opposite of what this project seems to be doing.


Laws would do well to follow the rules of software. Small modules with clear responsibilities with an emphasis on readability and test cases that are run before you go to prod, for example. Testing is expensive so I understand why the legal system would rather just push their code and fix bugs when they see them in the wild. The collateral damage for people caught up in real life test cases is tolerable, especially when it’s someone else footing the bill.

Linting and type checking the existing codebase would also be more helpful than rewriting everything in a new language. Enforcing size constraints on vocabulary and word count. Cross referencing between different legal systems. Throwing out dead laws that are no longer executed in prod. Profiling the efficiency of existing laws to find hot spots.

There’s little incentive to do this when the current system is run by a cadre of highly trained legacy COBOL programmers. I’d pick a very small part of the system — incorporate a new city and start from the ground up — and take it from there with the clear eyed expectation that a full rewrite is going to take a century.


Moreover, I think laws work similarly to software. People write laws, others find a loophole and use it, then people fix it with patches, and so on...


Law is a mess, in part because its authors take shortcuts. For example, from the first example on CatalaLang's

> If the property was acquired by gift [and various conditions apply], then for the purpose of determining loss the basis shall be such fair market value. [emphasis added]

I think (and I'm not a lawyer or a tax expert) that this means that the basis of an asset can have a different value for the purpose of determining gain or determining loss. Wow, basis isn't just a number, although one might not notice this if one didn't read the six emphasized words.

But the Catala code seems to completely ignore this. Oops. I filed an issue:

In a real use case, I imagine that substantial refactoring of the parts that consume basis might be needed when one notices that the basis is not a number.
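
To make the refactoring concern concrete, here is a sketch of what "basis is not a number" could look like. It models only the one quoted sentence, under my non-lawyer reading of it. `Basis`, `gift_basis` and `gain_or_loss` are hypothetical names, not Catala's, and the treatment of a sale price that falls between the two values is my assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basis:
    """Basis as a pair rather than a single number."""
    for_gain: int
    for_loss: int


def gift_basis(default_basis: int, fair_market_value: int,
               conditions_apply: bool) -> Basis:
    """Hypothetical reading of the quoted sentence only.

    `conditions_apply` stands in for the "[various conditions]" elided
    in the quote; they are not modelled here.
    """
    if conditions_apply:
        # Only the loss side switches to fair market value.
        return Basis(for_gain=default_basis, for_loss=fair_market_value)
    return Basis(for_gain=default_basis, for_loss=default_basis)


def gain_or_loss(sale_price: int, basis: Basis) -> int:
    """A consumer of Basis: positive = gain, negative = loss, 0 = neither.

    Assumption of this sketch: a price between the two values is neither.
    """
    if sale_price > basis.for_gain:
        return sale_price - basis.for_gain
    if sale_price < basis.for_loss:
        return sale_price - basis.for_loss
    return 0
```

Every consumer that previously did `sale_price - basis` now has to decide which of the two fields it means, which is exactly the kind of change that ripples through a codebase.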


I was a bit confused by the fact that my first language is Catalan, which in Catalan is spelled Català. So yeah, imagine someone proposing a language specification for the law called English.




As someone who speaks Catalan, I find the name collision with the language referred to in the article most unfortunate.


Why unfortunate? Also, I love your username. I think many of us could have guessed that you spoke Catalan from that alone ;).


The Catalan language name is written Català.


I also came to comment that the software's name might cause confusion with the human language, Catalan.


It is named after a French guy called Pierre Catala, not the language. Probably the name comes from the language but according to the documentation does not seem to be with acute accent.

> The language is named after Pierre Catala, a professor of law who pionneered the French legaltech by creating a computer database of law cases, Juris-Data. The research group that he led in the late 1960s, the Centre d’études et de traitement de l’information juridique (CETIJ), has also influenced the creation by state conselor Lucien Mehl of the Centre de recherches et développement en informatique juridique (CENIJ), which eventually transformed into the entity managing the LegiFrance website, acting as the public service of legislative documentation.


Their research group is called 'Prosecco', and another of their projects at Inria is 'Squirrel'... Another one is F* (pronounced F star).

I imagine that they find their naming choices amusing.


The commenter was maybe pointing out a naming conflict.


> the documentation does not seem to be with acute accent.

'à' has a grave accent, not an acute accent.


Yups right, sorry :(


I just figured that since you cared to name the specific accent character (most people probably wouldn't!), you'd care about the mixup too. :)


Not related to the Catalan language but to Pierre Catala: (


This paper definitely has thoughts on the matter

"As the Rules as Code movement gains momentum, questions are starting to be asked about the performance and practical effects of expressing law computationally. This article examines the strengths, weaknesses, and new opportunities of engaging with these emerging systems."

The Future of Coding podcast covered it recently

The abstract says, "Software code is built on rules. The way it enforces them is analogous in certain ways to the philosophical notion of legalism, under which citizens are expected to follow legal rules without thinking too hard about their meaning or consequences. By analogy, the opacity, immutability, immediacy, pervasiveness, private production, and ‘ruleishness’ of code amplify its ‘legalistic’ nature far beyond what could ever be imposed in the legal domain, however, raising significant questions about its legitimacy as a regulator."

It's a complex paper/topic that I personally need more time to grasp before throwing my opinions around too heavily. But my first, knee-jerk reaction so far is that moving laws into code is a bad idea. Specifically, as the paper says, "...code by its very nature tends toward a kind of strong legalism. This is the case regardless of the intent of the programmer, however vicious or virtuous that may be."

The "strong legalism" inherent in code means "the sovereign’s exercise of power is de facto legitimate, and thus not open to question." Not to be reductive, but that ain't good.

I feel we've seen evidence of this path already, with (easily refuted, but somewhat common) claims like "data can't be biased" (for example). The tendency to blindly follow a computer's dictate with, "Well, the computer says this is so, so it must be so." is strong in our society at times, I think.


I think having a "linter" for laws can be beneficial. It can help produce laws that are easier to read and understand.

Having a "compiler" for laws can help identify conflicts between different codes of law, e.g. imagine getting a compiler error when a law is unconstitutional from a logical standpoint.

But verifying the "business logic" (e.g.: what is the spirit or intent of the law?) of the law will remain a human intelligence task.


Can we use it to compute taxes?


What problem does this solve? It appears to add precision where it's mostly already clear - perhaps it can enforce some kind of rigor... but then like the example given uses "fair market value" as a term which I'd expect to be the kind of thing that's in contention, rather than any of the actual "logic", and it doesn't help with that.

The reason we have courts and lawyers is because of the need for interpretation beyond just writing good logic, so I don't see how this can really do anything. Or is it for something else?


I suppose a formalized legal language could:

- help in quickly testing whether newly drafted laws contradict existing laws (without needing to memorize the existing legal code)

- check for redundancies

- check whether removing one law affects any others

- statistically analyze legal systems in different countries

Assuming any of those are important issues in law. I'm not sure
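
For the first bullet, a brute-force contradiction check is easy to sketch once rules are functions over a small set of boolean facts. Everything below is made up for illustration (the two rules, the fact names, `find_conflicts`); real legislation would need far richer facts than booleans.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

Case = Dict[str, bool]
# A rule returns True (permitted), False (forbidden) or None (silent).
Rule = Callable[[Case], Optional[bool]]


def find_conflicts(rules: Dict[str, Rule],
                   facts: List[str]) -> List[Tuple[str, str, Case]]:
    """Try every combination of facts; report rule pairs that disagree."""
    conflicts = []
    names = sorted(rules)
    for values in product([False, True], repeat=len(facts)):
        case = dict(zip(facts, values))
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                va, vb = rules[a](case), rules[b](case)
                if va is not None and vb is not None and va != vb:
                    conflicts.append((a, b, case))
    return conflicts


def existing(case: Case) -> Optional[bool]:
    # Made-up existing rule: no vehicles in the park.
    return False if case["is_vehicle"] and case["in_park"] else None


def draft(case: Case) -> Optional[bool]:
    # Made-up draft rule: ambulances may drive anywhere.
    return True if case["is_vehicle"] and case["is_ambulance"] else None


conflicts = find_conflicts({"existing": existing, "draft": draft},
                           ["is_vehicle", "in_park", "is_ambulance"])
```

Here the only conflicting case is an ambulance in the park, which is the sort of thing a drafter would want flagged before enactment. The search is exponential in the number of facts, so this only shows the idea, not a scalable method.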


Catala still produces plaintext legal documents at the end of the day but can be seen as a markup language for those documents. But because that markup language is a whole lot more precise than the legal text itself, it can be a bit more versatile.

Examples of how this could be useful:

- Reducing the overhead for maintaining a list of semantic translations of that legal code into other languages. Of course the official language is the only one that is "legal" but the other translations should be close enough to effectively express the nuance provided the language outputs are maintained by people who can actually speak those languages.

- Producing machine executable proof or simulation code. This could be used for "fuzzing" the legal code to identify loopholes or unintended outcomes so that legislators can then propose improved terms to avoid those issues. This is by no means "making code law" but it provides an additional tool for understanding the law and how the many different parts of the legal code interact with each other.

- Adding on to the previous example, sim code could be integrated into complex models for simulating the impact of legal changes on the economy at large or specific segments.

- Finance related code can be used to generate a tool or API for validating tax, accounting, and compliance documents (as a first pass to catch errors early and reduce overhead) as well as to even prepare some of those documents. These tools often already exist but they are one or more steps removed from the actual legal definition which increases the risk of error as well as the overhead of maintaining them (which can potentially encourage rent seeking behavior by commercial providers of these tools).
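
The "fuzzing" bullet above can be sketched on a toy rule. Assume a made-up allowance with a hard cutoff; the property being fuzzed is "earning one unit more never lowers net income". None of the names or figures come from Catala or any real legislation.

```python
import random
from typing import Set, Tuple


def benefit(income: int) -> int:
    """Made-up rule: a flat 200 allowance below an income of 1000."""
    return 200 if income < 1000 else 0


def net_income(income: int) -> int:
    return income + benefit(income)


def fuzz_for_cliffs(trials: int = 10_000, seed: int = 0) -> Set[Tuple[int, int]]:
    """Sample random incomes; collect pairs where a raise of 1 lowers net income."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    found = set()
    for _ in range(trials):
        income = rng.randint(0, 2_000)
        if net_income(income + 1) < net_income(income):
            found.add((income, income + 1))
    return found
```

Any pair this returns sits at the 999/1000 boundary, where the allowance disappears all at once. That is an unintended outcome a legislator might want to smooth out, and finding it required no opinion on what the law "should" say.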

France is actually already doing this to a reasonable degree, albeit the "codified" version is based on the law rather than the codified version producing plaintext law. The DGFiP [1] maintains a GitLab organisation [2] that includes both Catala and MLang [3] representations of different parts of the French legal code for exactly these purposes.





> Of course the official language is the only one that is "legal"

If only! Here in Canada there are two official languages. All laws are drafted, and enacted, in both English and French. Both versions are equally valid, equally binding. And, sometimes, they don't say the same thing.


Catholic Canon Law is drafted and coded in Latin, which is supposedly the only official and binding version. However, this is translated into hundreds of vernaculars! People often read it in their native language for convenience, even lawyers, but this can be fraught with peril.


Yeah that's common in a number of other countries as well. I should have probably said "the official languages are the only ones that are legal" instead. In which case a tool like this could be useful for helping maintain that equivalence.


According to the example in the README, it's specifically for legal texts that give rise to code… So it should be a path toward some form of literate programming or proven implementation.

An example of a legal text that should/may become code somewhere: the senate votes to give a pension to veterans who meet some criteria… But less well-known rules already exist for some cases, and they may be incompatible.

I think that, coupled with some kind of Prolog, it may help detect inconsistencies early.


>It appears to add precision where it's mostly already clear - perhaps it can enforce some kind of rigor...

I'd argue that the imprecision of law is more feature than bug. Rules as written have edge cases and, as long as the law is written in natural language, you can get a feel for their intent and that helps Judges decide what to do in those situations.


Super interesting. I also think we would win with a kind of versioning system for laws, including a definite objective for a law from the time of its creation, and constraints under which it should be questioned again


I was expecting Lojban, or something similar. I remember someone proposing Lojban as a tool that could improve the clarity of legal and contractual writing.


That's really cool!

I was puttering around with the idea of a ricardian compiler for legalese, basically a decompiler for something like this that could compile a legal text into clear logical rules. This would aid in proof checking for legal documents to ensure that they're compatible with existing law, that there are no (unintended lol) loopholes and the like. It would also be useful if you wanted to create self enforcing legal documents that can be enforced deterministically by machines, such as collateralized agreements, and finally, even though someone would still need to know legalese, it could make the development of such agreements easier for people and lower the bar tremendously.

I wonder if anyone has built anything like that, if these guys have, or if anyone has built other interesting ricardian compilers.


I think what I'd rather see is a standardized test suite format for laws that spells out the intentions.

Once I lived in a state that proposed a very simple anti-child porn law with good intent, but it was too simple. It read sort of like "anyone sending explicit pictures of minors from a cell phone will be guilty of conveying child porn". It was written in the proper legal jargon, but wasn't a whole lot more detailed than that. I called the sponsor of the bill and asked if that meant if my hypothetical daughter sent a naked picture of herself to her boyfriend, then wouldn't she be a felon under his new law? He had an "oh, crap, that's not what I meant!" reaction and ended up withdrawing the bill so it could be re-written. (Aside: I felt pretty good about that. Props to the legislator for being quick to understand and respond appropriately!)

Imagine if that were handled like program code, with a test like:

* This law does not apply to minors sending pictures of themselves.

That would do a few big things:

It would make legislators be clear about what they mean. "Oh, we'd never use this online child safety law to ban pro-trans content from the Internet!" "Great! Let's add that as a test case then." I confess that this is a deal breaker: politicians don't like being pinned down like that.

It would probably make it easier to write laws that reflect those intentions. "Hey, that law as written would apply to a 15 year old sexting her boyfriend! The code doesn't pass the tests."

Future courts could use that to evaluate a law's intent. "The wording says it applies to 15 year olds sending selfies, but the tests are explicit that it wasn't meant to. Not guilty."

I'm sure this couldn't happen for a hundred reasons, but I can dream.


I partly like this idea in theory, but believe it is literally 100% impossible to come up with a better "test suite" than "the actual court system?"


I don't want me or my male family members to be labelled sex-offenders whilst you "test" the "court system" to see if the laws work as intended. All because some overzealous prosecutor wanted to be "tough" on "toxic masculinity".


Oh, I mean, Black man here -- I 100% agree with you that there are serious and deep problems with how things are done now; I just have very little faith that any hypothetical nerd testing like we're talking about here will do much better.


If it was 'literally 100% impossible', you wouldn't need to believe it, you'd know it to be so.

As for test suites and courts - the two are complementary, so there's no need to compare them to one another.


The courts have to evaluate what they think the law's drafters meant: Yeah, it says this, but it's obvious the legislators didn't mean for it to be read that way. It'd be nice if there were footnotes that expounded on what the authors were trying to accomplish to help courts interpret the laws.


At least in the US, it's not the drafters' intent that matters, but the intent of the legislators who voted on it. (Legislators actually have lawmaking power. Drafters are usually unelected staff or even lobbyists.)

When a statute is ambiguous, courts do sometimes look at the congressional record (eg floor debates) to determine intent.


To add on that, there are several competing theories of statutory interpretation in US legal thought. It's a very complex subject.


Yeah, there’s no way that a modern computer could outdo the logical accuracy and processing power of our 300 year old legal system. Court rooms and arguing and paperwork, much more efficient than silicon.


With developments in AI I doubt that will stand the test of time.


Courts and lawyers are efficient at generating large pay checks for lawyers.

They are horrible at everything else.


Trying to measure it for "logical accuracy" and using ideas like "processing power" so very deeply demonstrates how little you understand what actually is happening.


You have a bunch of judges, none of which are immune to racism and corruption, giving out their version of the law from big oak benches while a suspect is held against their will in jail for however long. I think a computer can do better.


Mindblown. I went to law school and now work as a developer but never thought about it. Writing tests for laws should totally be a thing.


> Writing tests for laws should totally be a thing.

It is a thing already. In both the US and Germany it is common for lawmakers and regulators (e.g. the FCC which was here on HN to solicit comments a few days ago) to provide drafts of laws and regulations to interest groups so that these can raise issues they find.


Also free agent simulations for discovering unintended consequences


law tests would be good, though they probably have to be "evaled" via the same mechanism which would apply them. Meaning, courts :\

I personally would be happy if any country would attach rationale for the law to the law itself. And possibly some KPI to see if it works. So the law could be reevaluated later, to see if it works at all, or maybe counterproductive, or maybe some major actual application of the law is not why it was introduced.


99% of the work is in coming up with the edge cases, and in law the most common thing to do with edge cases is call them out explicitly. I imagine the legislator went back and added a clause to the law that specified "it shall not be considered a violation of this section for a minor to send photos of themselves".

Laws don't need to be computer-executable, they're about intent and the interpretation thereof, so the test suite itself is really part of the law and may as well just be embedded in it.


I think for something like this to be effective, you need the actual intent encoded correctly (so this use case wouldn’t have been solved), and lawmakers acting in good faith (i.e., not drafting legislation that’s intentionally vague such that it can cast a wide net and force people to use the courts to dispute things).


You don't need those voting for the bill to act in good faith if you have as part of the system of passing bills that the opposition gets to write the (adversarial part of the) test suite. Then either that forces any loopholes (or other undesirable effects) to be: updated as explicitly intended (with bad publicity and potential for reversion upon a change of government); or taken into account and the bill updated to reflect that, or left in the test case for case law to cite as intended.

Either way you really want intent to be encoded somehow.


Intent could be derived from the minutes of parliamentary debates.


Yeah, uhh, having seen some of the hearings in recent state legislatures regarding abortion, that’s some flawed thinking. Lawmakers are intentionally vague throughout the entire process sometimes.


It's great that your congressperson was enlightened enough to pull the bill. Some states (such as Minnesota) apparently actively prosecute children for sexting.



I was pleasantly surprised.

Minnesota is so bizarrely, irrationally wrong on that one. That poor kid needs an adult to sit her down and explain why sending out nude pics as a minor is a really bad idea, not to label her as a sex offender. Now, if someone (especially an adult) received those pics and shared them, go ahead and charge that person.



Catala: a programming language for socio-fiscal legislative literate programming - Oct 2020 (37 comments)


Interesting that their “money” type doesn’t allow fractional cents or track the currency. And the only collection type is a fixed-sized list. I can imagine a lot of other useful base types that would be useful for laws about the real world, such as physical units like meters.
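
For comparison, a money type that does both is not much code. This is a hypothetical sketch, not how Catala represents money; the `Money` class, the string currency code and the rounding mode are all assumptions.

```python
from dataclasses import dataclass
from decimal import ROUND_HALF_EVEN, Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal  # exact decimal, so fractional cents are representable
    currency: str    # e.g. "EUR"; a plain string is an assumption of this sketch

    def __add__(self, other: "Money") -> "Money":
        # Tracking the currency lets mixed-currency sums fail loudly.
        if self.currency != other.currency:
            raise ValueError(f"cannot add {self.currency} to {other.currency}")
        return Money(self.amount + other.amount, self.currency)

    def rounded_to_cents(self) -> "Money":
        # Rounding is an explicit step, not something the type does silently.
        cents = self.amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
        return Money(cents, self.currency)
```

The trade-off is that a law then has to say where rounding happens, which may be why a language for legislation would prefer to fix the precision up front.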


It’s good that they made it not Turing complete, but they should probably also force an upper bound on law complexity




Completely unrelated to Catalan (Català), the language spoken in Catalonia (Catalunya). I think if someone wants to google a question about this, "catala language beginner hello world" won't help them much.


I can only conclude that they didn't know that. It's such a bad name; we won't even be able to google "catala lang" because ... Catala is also a "lang"!

Imagine someone creates a programming language called Russian. Good luck googling "russian lang".


given that the major contributors seems to be from paris / bordeaux i am pretty sure they know about catalan.


> given that the major contributors seems to be from paris / bordeaux i am pretty sure they know about catalan.

Yes, but did they know that català is Catalan for Catalan? In French, as in English, it is catalan instead (well, English always capitalises it, French doesn't, but the spelling is identical).


You'd write "catalalang" just like you write "golang", if you really need to. I think in most contexts, search engines would be able to infer what you mean.




To have fun with laws/rule systems, take a look at Nomic



an example of the game rules:

In the context of this thread, I'm sad that the game rules don't appear to be in a constrained vocabulary


Something to keep in mind is that the courts are not necessarily trying to determine the truth, but rather to create a place that allows two parties representing different interests to duke it out. That's not always what the courts are used for, but it's a different mentality than science or programming.


A software for people. For dictating human behavior. Is that what we're looking at here?


Started something like this years ago with a company that ended up pivoting in a slightly different direction after a while. Glad to see something in the open source space.


Also see Logical English, a "Programming Language for Law and Ethics"


Naming is hard: Catala is cardsharper (cheat) in Russian.


Catala is also a human language.


No way would this ever be a good idea. Language changes too much, be it human or machine. This is why you need flexibility, not a rigid structure, in making law.


I'd be interested in seeing something like this for verifying game designs / new game rules given an existing design


Why? What do game designs have to do with law?


Games are activities bound by rules. Laws are rules for government/governed.

AFAIK there's not really a programming language specifically for describing how players interact in a game, although there's no reason you couldn't implement it in any old programming language. I guess the same thing could be said of the law too, until Catala.


Wouldn’t that just be writing test cases against the business logic of your game?


I assume they're talking about tabletop/board games and such, not video games.


Ahh, that was the context I was missing. Indeed, makes sense for board games.


What are the authors' goals, what is their intended purpose? I can't find a mission statement on their website.


It would be interesting to also "weave" in test cases. The workface of logic statements is exactly where bugs are introduced.

Especially around temporal events, and that goes to formal models (and even more bugs).

Typically, if there is a rule around height, there would be at least three tests: one taller, one equal to, and one shorter. (Without types or something, then also negative, null, and max/min boundary inputs too.)
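
Spelled out in Python, the three tests look like this. The rule and the threshold are hypothetical; the point is only the shape of the boundary cases.

```python
MIN_HEIGHT_CM = 120  # hypothetical threshold, for illustration only


def may_ride(height_cm: int) -> bool:
    """Toy rule: riders must be at least MIN_HEIGHT_CM tall."""
    return height_cm >= MIN_HEIGHT_CM


def test_taller():
    assert may_ride(MIN_HEIGHT_CM + 1)


def test_equal():
    # The boundary case is where ">" versus ">=" bugs hide.
    assert may_ride(MIN_HEIGHT_CM)


def test_shorter():
    assert not may_ride(MIN_HEIGHT_CM - 1)
```

If the text of the rule said "taller than" rather than "at least", `test_equal` is the one that would force somebody to notice the difference.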

So you could have tests based on timelines, like

  Given a regulation is passed in 3 months
  And parties are prevented from exercising B
  But "17 tons" of waste are dumped anyway
  And ...
  When ...
  Then ...

Having a model checker integrated would be a boon. Maybe we could have DevOps-like pipelines in formally-verified legislature (or at least the encoding of language to code).



It will not compile (sarcasm).

Many laws are written with a lot of double meaning (the recent EU regulations on allowing or not allowing Russian cars are a good example).

Though it could be a good idea to find all the possible double meanings or vague definitions when trying to "digitise" the laws into the programming language.


Right, some laws are just FTP-dragged-and-dropped to prod, with the understanding that they don't compile yet. But if the legislator managed to express the _intention_, the courts will over time add test cases and make them all pass.

I'm sure the mentality of a PHP developer running a successful but insane legacy site is a better model for this than a perfect OCaml project :)


It would also be nice to have unit tests


This is great. I often look at unit tests to understand how to use legacy code. Seeing a law come with concrete examples that are part of the legal text, would allow us to test if a rule is logically consistent, and test our understanding as well. Sort of how you want to convince your reviewer that your code works by proving it with a unit/integration/e2e test.

End to end tests for legal frameworks?


Imagine... TDD applied to laws.

Or take it a step further, write a test suite and automatically generate a range of possible laws that satisfy the tests.
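
A toy version of "generate the laws that satisfy the tests", assuming the search space is a single age threshold. The test cases and the names `TESTS`, `make_rule` and `candidates_passing` are made up for illustration.

```python
from typing import Callable, List, Tuple

# Each test case is (age, expected_allowed). Made-up examples.
TESTS: List[Tuple[int, bool]] = [(15, False), (17, False), (21, True), (40, True)]


def make_rule(min_age: int) -> Callable[[int], bool]:
    """One candidate 'law' from a tiny search space: allowed iff age >= min_age."""
    return lambda age: age >= min_age


def candidates_passing(tests: List[Tuple[int, bool]]) -> List[int]:
    """Return every threshold in 0..100 whose rule passes all the tests."""
    return [m for m in range(0, 101)
            if all(make_rule(m)(age) == expected for age, expected in tests)]
```

With these four tests, `candidates_passing(TESTS)` returns `[18, 19, 20, 21]`: the tests narrow the space but do not pick a single law, so choosing among the survivors is still a policy decision.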


For years I've tried to convince my lawyer friend that something exactly like this would be great to have, and then it turns out to have existed probably all the time.

I think this is truly awesome.

Every law should be written in a language like this, and presented publicly with syntax highlighting and consistent formatting rules.

Then it should be made part of the school curriculum to learn the law language.

I believe it would greatly improve everyone's ability to read laws and be confident about their understanding of them, which would be a huge boon for society.


This is quite a misunderstanding of how the law actually works, probably enhanced by lots of lawyer TV emphasizing obscure wording tricks.

In reality, laws are already written in a relatively normal language, and the words almost always mean exactly what they mean in plain English. The only problem is that the legal concepts they describe are themselves complex, and often they end up in a tangle of references to other laws and regulations and legal precedent etc.


I don't know if what you say about laws is correct, but it's certainly not a correct description of contracts. The general public is constantly confronted with utterly unreasonable legal documents: far too long, far too unclear, far too complex. The lawyers writing these, and the entities paying them, both know that the public will never read or understand them. It's pure cynicism.


Sure, but that wouldn't change if they were written in code. It would probably get much worse, in fact.


I don't think so. Any state progressive enough to adopt law-as-code might also put in place laws limiting the length and complexity of the code. It would be easier to control this aspect as code I believe.


By that same token, any state progressive enough to think about law-as-code might first put in place measures to limit the length and complexity of contracts written in plain language as well.

The problem here is the length and complexity, not the language used to express the contract. And note that law-as-code would necessarily mean that a layman is fully unable to understand a contract (or the text of a law) at all. They would be fully reliant on a specialized worker to explain the code to them in plain English. Current contracts, if you have the patience to read and map them in full, are fully understandable by anyone with a good knowledge of the English language.


I think this is a big part of it. My sense (as a non-lawyer who has looked at a fair number of laws and contracts) is that, in addition, there are plenty of laws and contracts that are just poorly written, along with wording or constructions that lawyers have retained out of caution or traditionalism. This last case, traditionalism and caution, is maybe a special case of the other cases, but it's not always obvious.


I believe that another thing that happens, and it is common in every field, is that certain constructions are very common within the field, and so the need arises among practitioners to shorten them. Most domains invent new words, but this doesn't work for laws and contracts (since they need to at least in principle be understandable to non-practitioners, such as most elected officials).

So instead of full-on jargon, legal texts get enshrined phrases, which practitioners can essentially skip over, but which also retain some meaning in plain English (though often sounding antiquated).


I worked for a company that translated certain kinds of legal contract into what was effectively a DSL. They could then be represented in a simplified way. That was the whole business.

The CEO (a lawyer) claimed that the lawyers who wrote these things would deliberately and unnecessarily overcomplicate them so that they could maximize billable hours.

I think they could quite easily have been templated using a DSL but that DSL would need frequent maintenance. These types of contracts did evolve as new types of clauses and legal constructs popped up and gradually evolved from "new" to "standard" to "boilerplate".


The text of a contract can either be long and very explicit, or short but full of implicit assumptions. A DSL is the second kind: you encode those assumptions in the structure of the DSL, and the text as written is based on all of those assumptions.

The problem then becomes that anyone who wants to understand the contract now has to read not just the contract as written, but also the whole definition of the DSL itself. This can actually be OK if the DSL is very commonly used, such as a DSL for contracts between two parties which sign new contracts every day.

But it is a huge waste of time for parties which rarely sign contracts, and is often used as an explicit moat to keep laymen from participating. If I give you a contract to sign that isn't even written in plain English, you will have no choice but to hire a lawyer specialized in understanding this contract DSL to advocate for you.

I imagine (savvy) lawyers actually love DSLs that purport to make contracts concise.




That's absolutely not correct. A DSL is not necessarily short and implicit. It can be very implicit or very explicit and the one I worked on was explicit. Its defining feature would be that it is straitjacketed.

The customers in our case did not actually look at the DSL - it was entirely internal. We decompiled the legal document into the DSL so that we could then represent the contract in more understandable ways.


If you can't read documents in whatever form for their legal meaning, you can't work around the need for a lawyer. The DSL may be defined in comprehensible enough language and texts in it may be interpretable easily enough; but the method of contract is determined by agreement of its parties (inside the bounds set by law).


Currently, contracts are judged by their meaning in plain English, with any additional definitions being stipulated in the contract itself (either explicitly or as part of the verbal agreements that accompanied the negotiation of the contract).

A DSL is an extra layer of abstraction above that. If you agree to a contract written in some DSL, then you must also agree to the way that DSL translates into plain English. To significantly compress a legal contract that is not deliberately written to obfuscate its meaning, the DSL has to pack a lot of precise meanings into every term, making it very dense and hard to parse unless you're well-versed in it.


A contract is anything two parties agree to with some consideration (benefit) exchanged. The law does not distinguish between 'proper' contracts and informal ones (like a handshake) except a proper one may be quicker to execute. And this is a feature... You wouldn't want to force everyone to undertake the cost of developing highly specialized legal products just to do business.


As someone who has tried to read the Black’s law dictionary once, I must tell you that the law does use a lot of formal language.


It does, to some extent, but it's still much closer to natural language than code. Expressing the law in code would turn this dial up to 11 and then some.


Sure, but there would be all manner of services that could automatically translate law-in-code to any human language you wanted. The best part is that you could easily and automatically translate a law into, say, English, Japanese, and German. Whenever the law-in-code source changes, just rerun your translator and voila: no human intervention required (meaning faster and more accurate translations of the law into human-readable language).

You could even program the "law-in-code-to-humanspeak" translator to generate different levels of the target languages, e.g. translate into something at a 6th grade reading level vs. something at a grad school level. Again, the advantage would be the automaticity.
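A toy sketch of the idea, in Python, might look like the following. It assumes a rule can be reduced to three text fields and that each "reading level" is just a sentence template; the `Rule` type, the `TEMPLATES` table, and the tenancy rule are all hypothetical, and real legal text would need far more than string substitution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    subject: str
    condition: str
    obligation: str


# One template per target register (hypothetical; a real system would also key by language).
TEMPLATES = {
    "plain": "If {condition}, {subject} must {obligation}.",
    "formal": "Where {condition}, {subject} shall be required to {obligation}.",
}


def render(rule: Rule, level: str) -> str:
    """Render the same structured rule at the requested reading level."""
    return TEMPLATES[level].format(
        subject=rule.subject, condition=rule.condition, obligation=rule.obligation
    )


rule = Rule(
    subject="the tenant",
    condition="the rent is more than 30 days late",
    obligation="pay a late fee",
)
print(render(rule, "plain"))
print(render(rule, "formal"))
```

Whenever the structured source changes, re-running `render` regenerates every variant, which is the automaticity being described; whether the generated text would carry legal weight is a separate question.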


But now instead of reading the laws that govern my life myself and understanding them, I have to rely on some third party to explain to me what laws my representatives are voting on.


The words almost always mean almost exactly what they mean in plain English. That’s why the law is a huge mess.

As a fun exercise, try to find how many definitions of “child” there are in the US law and how many times it’s used undefined.


> As a fun exercise, try to find how many definitions of “child” there are in the US law and how many times it’s used undefined.

I don't see how a language is going to solve this. This isn't really an issue of language, it's an issue with ambiguity in the very intent itself. That's why we have judges, who interpret that intent.


Indeed. Ambiguity in legal codes is a feature that allows them to remain relevant for more than a couple of years, and the formal-law "utopia" that some commenters here appear to desire would be a nightmare if put into practice.


They mean disputes go on for years, costing society a lot and enriching intermediaries.


Exactly, it is truly horrifying to think of a law encoded such that it can not be superseded by human interpretation.


Personally I'd rather have laws that work, are consistent and easily understood if you can follow 'if A then B' logic, even if they have to be updated more often.

Relying on ambiguity is admitting there are no laws, and we rely on the common sense of the people in the judicial system.


> Relying on ambiguity is admitting there are no laws, and we rely on the common sense of the people in the judicial system.

What's better:

- relying on the common sense of the people in the judicial system to interpret the intent of the law as it applies to a particular situation,

- or relying on the common sense of legislators to write a precise, unambiguous law that will cover all possible situations without negative unintended consequences, and without the law-writing process being influenced by spurious interests and pressures?

The answer highlights why the law is interpreted as it is now, with a body of trained public servants analyzing the particulars of each situation, rather than by fanatics trying to follow the letter of the law.

Languages like this may help clarify the intent of legislators so that it is not twisted by clever lawyers; but a human reviewer should always check the relevance of one law to any particular case, and whether the text corresponds to the original intent as applied to a given situation.


This is a solved problem in programming: you hover over Child and hit "Go to definition". If it's ambiguous, the lawmakers would get a compile error instead of pushing a broken law. They may even have to pay us for consulting to fix it, depending on how esoteric the language is. How nice is that!
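For what it's worth, the "compile error" part can be sketched in a few lines of Python. This assumes a statute has already been reduced to a table of definitions and a set of terms it uses, which is the hard part and is not shown; `check_terms` and the sample data are made up for illustration.

```python
def check_terms(definitions: dict[str, list[str]], used_terms: set[str]) -> list[str]:
    """Report terms that are used but undefined, or defined more than once."""
    errors = []
    for term in sorted(used_terms):
        found = definitions.get(term, [])
        if not found:
            errors.append(f"undefined term: {term}")
        elif len(found) > 1:
            errors.append(f"ambiguous term: {term} ({len(found)} definitions)")
    return errors


# Hypothetical statute: "child" is defined twice, "guardian" not at all.
definitions = {
    "child": ["a person under 18", "a person under 21 who is a dependent"],
    "dependent": ["a person relying on another for financial support"],
}
used_terms = {"child", "dependent", "guardian"}

errors = check_terms(definitions, used_terms)
for error in errors:
    print(error)
```

A check like this only catches missing or duplicated definitions; it says nothing about whether a definition matches what the legislator intended.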


Legal codes aren't meant to work like programming languages. It is impossible for a legislator to predict how the world will work when their law is applied, and it is highly unlikely that they will anticipate every situation and context in which their law will be invoked.

Judges and juries and lawyers all exist to help us interpret the inexact legal code in a way that is (hopefully usually; but obviously not always) fair and reasonable given the often-nuanced situations at hand.


That doesn't add up with how the average piece of legal code looks. Even when I read a brand new piece of legislation it's an impossible soup of words, attempting to document every possible edge case the legislators thought of. If the point was to leave room for interpretation by a judge and you concede you don't have full context of how what you're writing will be applied, surely you could write much more sensible and human-readable text.


It adds up to how the average piece of legal code works in practice.

Law is political. It's persuasive, not deterministic. It often comes down to a judgment based on the relative political power of the entities in question.

Even if you find a statute that says very clearly that X is unlawful, there will be situations where a lawyer will argue that it isn't.

Sometimes they'll make that case successfully - for various possible reasons, not all of which will be lawful themselves.

This is one reason why statute law is expanded by case law. And good luck trying to automate case law.


I would be happy if the law conforms to other existing laws, not future edge cases. The solution being proposed here would do that, while the existing system would not.


It isn't a bug, it is a feature. There is a point in the law sometimes being a bit vague: it creates flexibility.


Being vague and up to interpretation because lawyers can’t foresee all possible circumstances is a feature.

Being a mess riddled with inconsistencies is not a feature.


> As a fun exercise, try to find how many definitions of “child” there are in the US law and how many times it’s used undefined.

Which is why it is the common standard in German law to define every possibly unclear term either in the relevant section of the law itself or in an introduction article.


It's the standard in the US, too, but "child" is a case where plenty of lawmakers have assumed it couldn't possibly be misunderstood, while plenty of others have defined it with contradictory definitions in their respective sections.

The latter is better than the former, but it also makes it even harder to interpret sections with no definition.


I don't agree that the law is a huge mess. It is certainly far less of a mess than any code base I've ever seen, given the gigantic scope of what it applies to, and how many people it affects.

Note that the legal system is indeed a huge mess, but that happens because of many other reasons - not a problem with the wording or vagueness of the law, but with the explicit (malicious) intentions of law-makers, judges, police and others involved in the whole process.

For your example of "child": how often does it actually cause a problem in practice? How many people have been improperly punished/set free because of a poor interpretation of the word "child" in a specific law? This is far more relevant than every law taking up valuable space to define what such a common word means.


How do you codify the effects of judicial review? What if the court's decision is not binding but is still persuasive -- such as when it's a decision from another jurisdiction?

Common law has like 800 years of tech debt. It's turtles all the way down. And by turtles I mean precedent, and not all of them are compatible.

None of what I've described is malicious. It's just what happens when law meets the messiness of the real world.


> How many people have been improperly punished/set free because of a poor interpretation of the word "child" in a specific law?

How many improperly punished people is good enough? How many cases go to Supreme Court because the amount of needless ambiguity just adds up, one word at a time?

> This is far more relevant than every law taking up valuable space to define what such a common word means.

Right? Why do many laws redefine it?


I don't know how many is good enough. If it's every other person, then that's bad; if it's two people since the law was written 50 years ago, I would say that's good enough in my book. Which is it?

And no, cases don't often make it to the Supreme Court because the wording of the law is ambiguous. They make it to the SC because the parties disagree on legal principles and on whether laws are unconstitutional or not.

> Right? Why do many laws redefine it?

I would have to see some specific examples to judge for myself. Still, this seems to be the opposite problem compared to what was raised earlier. So which is it? Do we want laws to be more explicit about their exact definitions of words, or more implicit?


> And no, cases don't often make it to the Supreme Court because the wording of the law is ambiguous. They make it to the SC because the parties disagree on legal principles and on whether laws are unconstitutional or not.

I'm not from your legal system, and yet even I know that's not right: they make it to the SC because the 'losing' party disagrees with a lower judge's decision that's already been made, and makes an argument compelling enough in appealing it that it needs to be reconsidered. (A few times, to get as far as the SC, probably.) That almost has to be because of some 'ambiguity' - the lower judge decided one way and the appeal is 'well no I don't think that's the correct reading'.


Note that the SC almost by definition only looks at cases that pertain to the constitution, or to base legal principles, not to individual laws. And good luck codifying the constitution itself and/or the rules of common law into formal math.

Also, appeals essentially mean that at least one of the parties believes that a judge made a mistake in the way they applied the law, not necessarily that the law itself is ambiguous. A judge can fail to apply a perfectly unambiguous law, or at least a plaintiff can believe that they did, and can bring enough evidence that their belief has some merit.


The GP is taking a restricted meaning of "ambiguity", where the ambiguity is within the laws relevant to the case being appealed.

But cases that go to the Supreme Court of the United States are often cases where the "ambiguity" is on how the US constitution should be interpreted and applied.

In a practical sense you're not going to be able to codify the US constitution into code. It's even a well received "feature" that constitutions are sometimes a bit ambiguous, see:


> And no, cases don't often make it to the Supreme Court because the wording of the law is ambiguous. They make it to the SC because the parties disagree on legal principles and on whether laws are unconstitutional or not.

I don’t see how those are exclusionary principles. Also, I have never heard about this case, but after some simple googling I learned that the Supreme Court indeed had to figure out the definition for children in a particular law:

> Still, this seems to be the opposite problem compared to what was raised earlier.

No, it is the same problem I mentioned before: some laws define it in a contradictory way; some laws don’t define it. I told you it’s a fun exercise!

> Do we want laws to be more explicit about their exact definitions of words, or more implicit?

I want laws to make sense. Inconsistency doesn’t make sense and only brings troubles. Vagueness is good. Ambiguity is bad.


> ..and how many times it’s used undefined

This is the real advantage of codifying law into a programming language. You can have validation and assertion that is automated. And a strict structure, free from ambiguity.

As an additional advantage, multilingualism becomes more accessible, with the codified program/definition acting as the lingua franca of law. Thus, someone who only knows English could make sense of Japanese laws by reading it.


I cannot imagine some system that goes from a formal language to reality not having ambiguity. Even going from mathematical formalism to mathematical truth you can’t get all the truth. I imagine getting all the justice would be harder.

And I have heard lawyers explain that sometimes ambiguity, in contracts at least, is a good thing as it reduces the amount of a priori negotiation for low-probability events. If the low-probability event happens and the contract is ambiguous, then you negotiate at that time and maybe sue.

And for laws, I think a bit of flex in the system probably would be a good thing. Give some scope for local judgment and autonomy to the people closest to the situation.


Why would you want every law that applies to children to explain what a child is? Why stop at child, perhaps each law should include a definition of every word it contains, right? That would certainly make every law much more readable for the masses.

The text of the law is meant to be understood by the people that it applies to, i.e. everyone living in the locality which passed said law. Expressing law in a formal language goes directly against that goal. Imagine if a EULA you get presented with, instead of being a wall of repetitive text, would be a wall of code with symbols you at best remember from some class you took in 8th grade.


Having EULAs and many types of contract be code would be a huge upgrade.

For one, it'd make them a lot shorter. You could use inheritance or composition to refactor out repetitive boilerplate, which is 90% of what EULAs are. The thing you see would only be the places where it deviates from a base EULA that you could study once.
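Roughly what I have in mind, as a Python sketch. It assumes clauses can be keyed by topic and that a vendor EULA is the base text plus overrides; `BASE_EULA`, `derive_eula`, and `deviations` are names made up for this example, not an existing standard.

```python
# A shared base EULA that a reader would only need to study once (hypothetical text).
BASE_EULA = {
    "license": "The user may install the software on one device.",
    "warranty": "The software is provided as is.",
    "termination": "The license ends if the user breaches these terms.",
}


def derive_eula(base: dict[str, str], overrides: dict[str, str]) -> dict[str, str]:
    """Return the full text: the base clauses with the vendor's deviations applied."""
    return {**base, **overrides}


def deviations(base: dict[str, str], full: dict[str, str]) -> dict[str, str]:
    """Return only the clauses a reader who knows the base still has to read."""
    return {topic: text for topic, text in full.items() if base.get(topic) != text}


vendor_eula = derive_eula(
    BASE_EULA, {"license": "The user may install the software on three devices."}
)
print(deviations(BASE_EULA, vendor_eula))
```

Here the reader is shown one clause instead of three; with a realistic base document the ratio would be much larger.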

For another, it would catch bugs automatically. I have caught bugs in contracts drafted by lawyers a bunch of times, just by reading them carefully. For example numbers that are stated in both words and digits but they don't match. References to clauses that no longer exist. Statements that are contradictory.
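Two of those checks are easy to sketch in Python, assuming the contract is available as numbered clauses of plain text. The function `lint_contract`, the tiny `WORD_VALUES` table, and the sample contract are invented for illustration; detecting contradictory statements is a much harder problem and is not attempted here.

```python
import re

# Deliberately tiny number-word table; a real checker would need a full parser.
WORD_VALUES = {"ten": 10, "twenty": 20, "thirty": 30, "sixty": 60, "ninety": 90}


def lint_contract(clauses: dict[str, str]) -> list[str]:
    """Flag dangling clause references and word/digit amounts that disagree."""
    problems = []
    for number, text in clauses.items():
        # References such as "clause 7" must point at a clause that exists.
        for ref in re.findall(r"clause (\d+)", text):
            if ref not in clauses:
                problems.append(f"clause {number}: reference to missing clause {ref}")
        # Amounts written as "thirty (30)" must agree in both forms.
        for word, digits in re.findall(r"\b([a-z]+) \((\d+)\)", text):
            if word in WORD_VALUES and WORD_VALUES[word] != int(digits):
                problems.append(f"clause {number}: '{word}' does not match ({digits})")
    return problems


contract = {
    "1": "Payment is due within thirty (30) days, subject to clause 2.",
    "2": "Notice must be given sixty (90) days in advance, as set out in clause 7.",
}
problems = lint_contract(contract)
for problem in problems:
    print(problem)
```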

A properly written language could be compiled to English for people who for some reason can't read the "real" language. But a well written PL for law would be quite readable.


Programming languages and other formal languages add boilerplate, they never remove it. Compare pseudocode for an algorithm with the actual implementation: it will almost always be much shorter.

All of the rest of what you mention could be achieved with plain language contracts exactly as well. Nothing prevents the software industry from getting together and producing a base EULA that all others refer.

Except of course for the fact that it would be utterly impossible to convince companies to agree to such an endeavor, whether in code or plain language or any other way. Especially since the purpose of EULAs is not to be clear, but to confuse end users with verbiage.


I don't watch that kind of television much, but since I'm not a lawyer, my view of the world is probably too simplistic.

Still I'd argue that normal language is very poor at handling that tangle of references.

A good programming language would make those references very easy to untangle and present in their untangled form.

When I've read (Danish) laws, I've often thought that they would read better as if statements.

It's not that I think those laws are written in legalese, it's that they are expressing logic in a suboptimal way. Like how "four plus four equals eight" is a suboptimal way to express what could be expressed with 4+4=8.
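To make that concrete, here is a made-up rule (not taken from Danish or any other real law) written first as one prose sentence in a comment and then as if statements in Python. The function name `may_apply` and all thresholds are assumptions for the example.

```python
# Prose version (hypothetical): "A resident who has reached the age of 18 may apply
# for a permit, except that a person who already holds a permit may apply only
# where that permit expires within 90 days."


def may_apply(is_resident: bool, age: int, holds_permit: bool, days_until_expiry: int = 0) -> bool:
    """The same hypothetical rule with each condition on its own branch."""
    if not is_resident:
        return False
    if age < 18:
        return False
    if holds_permit:
        # Existing permit holders may only renew close to expiry.
        return days_until_expiry <= 90
    return True


print(may_apply(is_resident=True, age=30, holds_permit=False))
print(may_apply(is_resident=True, age=30, holds_permit=True, days_until_expiry=200))
```

Writing it this way also forces a decision the prose leaves open, such as whether "within 90 days" includes day 90.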


> A good programming language would make those references very easy to untangle and present in their untangled form.

Not necessarily. Notation can only do so much to help with understanding. To understand 4+4=8 you still need to understand what's a number, what addition means, and what it means for two numbers to be equal. The same problem applies to the law, and it takes far more time to understand legal concepts than the actual wording.

Additionally, the law is not supposed to be some arcane discipline that you need to learn a new language for. The law is decided on by, and applies to, people who are not and have no reason to become legal experts. It is simply a statement of the rules by which we try to live.

If laws were written in code, they would actually become much, much harder to understand than they already are for the vast majority of their audience. Imagine a public debate about a law where the text of the law was, instead of plain(ish) English, Haskell code. Imagine news anchors explaining that Biden agreed to add the lambda sign, but was heavily criticized by McConnell for his use of a monad instead of a plain for loop.


> Additionally, the law is not supposed to be some arcane discipline that you need to learn a new language for.

It would seem to me that reading laws has become an arcane discipline, partly due to it being expressed in a language with overly long sentences, which handles branches and references very poorly from a readability perspective.

> Imagine news anchors explaining that Biden agreed to add the lambda sign, but was heavily criticized by McConnell for his use of a monad instead of a plain for loop.

While that would surely be interesting to watch, I think we both know that's not what would happen.

Like I wouldn't ask you "number four plus sign number four equal sign X?"


I assure you that no one who fails to parse long sentences would get a better understanding from replacing those with code of all things.

And if the actual text of the law consisted of coding symbols, I very much expect that (a) you'd have endless debates about the precise symbols being used, and (b) have to have anchors going over the meaning of those symbols and losing 9/10ths of their audience along the way.


You are mentioning symbols as a negative in a lot of your comments, but the language posted here is mostly using words and math operators.


If a word is used with a precise formal meaning, then it is a symbol more than a word. For example, "for" in C isn't the English word "for", it is a symbol for a precisely defined operation. Someone who speaks native English couldn't understand what `for (;;){}` means.


IANAL but from my experience reading laws, they're written in a way that looks like it's trying to replicate programming-language-esque nested logic, but in prose format—instead of using physical layout to establish the relationships between concepts, they use words, which I find more confusing. I would rather read laws written in a more structured format.


You (as a software engineer or similar) have been trained to read the 'more structured format' you'd prefer; lawyers have been trained to read 'prose'.

Don't get me wrong I'd prefer it too, I just recognise it's a consequence of my own education and experience, and if we somehow flipped the switch overnight most lawyers would be completely baffled and pining for the much clearer old way.


But all structured formats use words, and there have been many programming languages that read like prose since COBOL. No one wants to write laws in APL or something assembly-like.


The US Code is divided into numbered titles, sections, paragraphs and subparagraphs which can link to and incorporate each other by reference. This tree structure is reflected in typographic conventions which visually distinguish operative provisions from headings, indices and editorial notes. Isn't this the same kind of "structured format" as a code repository made up of libraries, modules, source files and functions?


If you read almost any state or federal statute, it's usually presented online in a tree format where definitions link to the defining article and referenced articles are hyperlinks to those sections.

It's all very readable.


Then you should be able to tell us what defrauding someone of their intangible right of honest services encompasses :)


Whatever statute you're looking at likely defines "intangible right of honest services" and "defrauding."


> For the purposes of this chapter, the term “scheme or artifice to defraud” includes a scheme or artifice to deprive another of the intangible right of honest services.

That's it.


Is that supposed to be very complex legal language? I'm not sure if this is some reference I'm missing, but it sounds relatively simple to me: it seems to be an accusation that someone is providing services dishonestly (i.e. hiding some aspect of how the service will be performed or how much it will cost).

The full scope of what that encompasses is very hard to know, as it depends on essentially every other regulation and common law practice and precedent that applies in the particular legal jurisdiction (and which jurisdiction that is can itself be a somewhat thorny issue).

But this problem isn't solvable by code. It's part of the intrinsic complexity of the legal system.


And the links! OMG, you could trace precedents and justifications and citations all the way back to the ur-utterances of Hammurabi. Reading that stuff could be quite educational.




Am actually working on a project like this. Not quite to the level of insanity you're thinking about, but certainly a couple of thousand years.


Former R&D director of legaltech company here. Lawyers already use tools that link everything.

For instance in the U.S., the Westlaw search engine for legal cases is such a tool: it can parse legal citations and turn them into hyperlinks. It ranks cases given a query based on a state-of-the-art machine-learning-based IR method, and it is aware of cases in the ranking that are currently overturned by higher courts (shown as red flags) or being reviewed (shown as yellow flags).
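To give a flavor of the citation-linking part only, here is a minimal Python sketch. It is not how Westlaw does it; it assumes a single citation shape ("18 U.S.C. § 1346") and a placeholder `base_url` on example.org, and the function name `link_citations` is made up.

```python
import re

# Matches only the simple "<title> U.S.C. § <section>" shape; real citation
# grammars cover many more formats (reporters, pin cites, ranges, and so on).
CITATION = re.compile(r"\b(\d+) U\.S\.C\. § (\d+[a-z]?)")


def link_citations(text: str, base_url: str = "https://example.org/uscode") -> str:
    """Wrap each recognized citation in an HTML link to a placeholder URL."""

    def to_link(match: re.Match) -> str:
        title, section = match.group(1), match.group(2)
        return f'<a href="{base_url}/{title}/{section}">{match.group(0)}</a>'

    return CITATION.sub(to_link, text)


print(link_citations("See 18 U.S.C. § 1346."))
```

Text without a recognized citation passes through unchanged, so the function can be run over whole documents.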

The software can also predict the outcome of a legal case and recommend courses of action that make winning more likely. It includes ruling statistics about all sitting judges.

Curiously, lawyers like to see the world as static, they do not like e.g. search results to differ between sessions, but of course cases get decided dynamically every day, which must necessarily also change the search results for any given query.

If people want to see what is the state of the art in AI and the law, I recommend you have a look at "AI & the Law" (ICAIL, the annual conference and the journal of the same name).


A big problem is you lose all the case law that interprets the law. Like discarding two centuries of bug reports and patches.

At the same time, laws have been codified (i.e. the case law rewritten coherently, with all the patches combined), such as the Uniform Commercial Code.


Although in principle I agree with you, the law generally depends too much on interpretation and precedent to be expressed and understood like you’re hoping for.


That information should be available in summarized anonymized form, referenced from the law language.

The judiciary should keep it up-to-date.




I know that naming is hard, and it has already been mentioned in the comments here but... I can't believe that somebody named a programming language with the exact same name as an existing natural language spoken by millions of people.

It just seems like a bizarre decision that can't be a benefit at all and can only have negative consequences. Just googling things about it is going to be hard. Why immediately create potential problems for yourselves when you can choose a name that's not an issue?


goofy; the language you're talking about is Catalan of Catalonia, not `catala`.

> Just googling things about it is going to be hard

when you're looking for docs on go do you google just "go"?

edit: fine, it's called catalá in catalonian itself - this is so pedantic now that i might as well at this point say that the missing diacritic is sufficient to disambiguate.


Catalá is the name of the language in Catalan; it is totally equivalent to saying "English" if you are a native.

This is obviously not an innocent choice. At this level I don't believe in coincidences, and CatalaLang makes it even more obvious. This looks like a veeeery obvious psy-op, or an independentist version of the old embrace, extend, extinguish.

My bet is that, as they can't stomach the basic legal concepts, they will try to silently replace them with the new "updated" meaning of those concepts.


Def a psyop. Classic Catalonian move of seizing independence by writing self determination laws in code and carefully introducing a bug that they can then exploit to secede.


it's catalá in catalá


It's actually "català" not "catalá". Source: I'm català.


> when you're looking for docs on go do you google just "go"?

Golang will do the trick. Catala lang will not unless the language becomes massively popular.


try catalalang right now behind incognito mode and tell me what you get




> when you're looking for docs on go do you google just "go"?

I wouldn't use Go as a good example of naming a language. It worked out because the language had the weight of Google behind it, but it's still awkward that you have to use a different name when searching for things than you do at other times.


> I wouldn't use Go as a good example of naming a language. It worked out because the language had the weight of Google behind it

this is called the no true scotsman fallacy - "I'm still right in XYZ case because XYZ isn't a real instance of ABC (the thing I'm making a claim about)"


No, it's not, because I didn't make a universal claim about anything. A No True Scotsman fallacy must follow an overly broad No Scotsman statement.

All I said is that Go, specifically, is an awkward name that probably shouldn't be used to justify further awkward names.



[deleted by user]



What? Why?

[deleted by user]

The flagged post didn't put it eloquently but the sentiment is right, and complaining about naming violates HN rules about not complaining about tangential issues. As it is they've at least temporarily ruined the discussion by having the top post be feigned concern about something that's not at all material to the content.


This hasn't seemed to help or hurt the popularity of other languages. You've got hot beverages, single letters, snakes, gemstones, two letter verbs, oxidized steel, languages where two thirds of the name are symbols, etc. It doesn't seem to matter. It appears that society, and search engines, are well-equipped to deal with the concept of homonyms.


None of the things you mentioned are languages themselves. If you google "catala language" (try it! seriously) you're going to get results for the natural language, not this one. It's just an unneeded roadblock that they placed on themselves.


I don't think I've ever typed the words "Java language", "C language", "Go language", etc. except in this comment.


There are even languages named: basic, pascal, java, rust, go, zig, dart, eiffel, camel, python, ruby, julia, scheme, racket, joy, mad, coq, lean, ...

[deleted by user]

Came here to share the same thought.




This is really cool! But I guess its biggest drawback is being unable to deal with case law?


Although I agree in principle, the closest thing we have to 'formal law' is smart contracts, and billions of dollars have already been stolen through bugs in these, despite barely anyone using them. I have some reservations about basing our entire legal system on code.


I think that comparison is way off.

Smart contracts are much more comparable to "a webshop" than to actual logic describing rules of arbitrage or other concepts at play in "law".


A smart contract is a well-defined series of rules that anyone can choose to interact with and get certain well-specified guarantees about the outcome. In that sense I think there are quite a few similarities, the main differences being that a smart contract deals with a much narrower set of concepts and is immutable.


Part of the problem with smart contracts is the term usually means Ethereum programs, which are usually neither good law nor good programming. They're not only playing a game with high stakes but the actual programming environment is a textbook case of how not to do computing. You can't generalize much from the experience there, it'd be like judging all of computing by looking at piles of PHP written by drug-addled teenagers.


this is peak hacker news 'solve human conflict by using technology'


I would have a much easier time understanding government legislation if everything was provided in such a language. I tried to compute my taxes by hand a month ago to see whether, and how much, money I would save if I enabled "loon middeling" (some Dutch law about income). But I couldn't figure it out. The explanations provided were ambiguous in some subtle way, leading me to incorrect assumptions. In the end I did figure it out by reverse engineering a free third-party calculation tool (which also was not correct, but putting their insight and my insight together made something that came close to the number on my belastingaanslag).
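
To illustrate why an executable version of such a rule leaves less room for misreading than a prose explanation, here is a minimal Python sketch of an income-averaging calculation. Everything in it is an assumption for illustration: the brackets, rates and threshold are invented, and the rule itself (recompute tax as if each year's income had been the average, then refund the difference above a threshold) is a simplified guess at how such a scheme could work, not the actual Dutch rules. The names `tax` and `averaging_refund` are hypothetical.

```python
# Hypothetical progressive brackets: (upper bound of bracket, marginal rate).
# These numbers are made up for illustration and are NOT real tax rates.
BRACKETS = [(20_000, 0.10), (50_000, 0.30), (float("inf"), 0.50)]

# Hypothetical part of the difference that is never refunded.
THRESHOLD = 500


def tax(income, brackets=BRACKETS):
    """Progressive tax: each slice of income is taxed at its bracket's rate."""
    owed = 0.0
    lower = 0.0
    for upper, rate in brackets:
        if income <= lower:
            break
        owed += (min(income, upper) - lower) * rate
        lower = upper
    return owed


def averaging_refund(incomes, threshold=THRESHOLD):
    """Refund from spreading income evenly over the given years (assumed rule)."""
    actual = sum(tax(i) for i in incomes)          # tax as actually charged
    mean = sum(incomes) / len(incomes)             # averaged yearly income
    averaged = len(incomes) * tax(mean)            # tax if every year were average
    return max(0.0, actual - averaged - threshold)


if __name__ == "__main__":
    # Strongly fluctuating income benefits; steady income does not.
    print(averaging_refund([10_000, 30_000, 80_000]))
    print(averaging_refund([40_000, 40_000, 40_000]))
```

With a definition like this, questions such as "is the threshold subtracted before or after comparing?" are answered by the code itself rather than by guessing at the wording.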