This is superb. Thank you for making it and licensing it MIT. I think this is a contender to replace the lexer within jank. I'll do some benchmarking next year and we'll see!
Wow, that is great news! :) Thanks for looking at it from this perspective! There are some benchmarks already available in the project - https://github.com/DotFox/edn.c/blob/main/bench/bench_integr...
You can run them locally with `make bench bench-clj bench-wasm`.
Let me know if I can do anything to help you with support in jank.
Oooo that’d be nice.
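Interesting, I had to look up what EDN is. It is important to note that EDN doesn't have a concept of a schema like JSON Schema.
This is a `map`, which resembles a JSON object. The following might look like an incorrect payload, but will actually parse as valid EDN:
{:a 1, "foo" :bar, [1 2 3] four}
; Note that keys and values can be elements of any type.
; The use of commas above is optional, as they are parsed as whitespace.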
If one wants to exchange complex data structures, ATerm is also an option: https://homepages.cwi.nl/~daybuild/daily-books/technology/at... Some projects in Haskell use ATerms, as the format is suitable for exchanging sum and product types.
One of the key design principles of EDN is to be exclusively a data exchange format. The same is true of JSON, where JSON Schema is something that sits on top of JSON itself. The same goes for EDN: in Clojure there is clojure.spec, which adds schema-like notation, validation rules, and conformation (https://clojure.org/about/spec). Something like this could be implemented in other languages as well.
Can the metadata feature be used to ergonomically emulate HTML attributes? It's not clear from the docs, and the spec doesn't seem to document the feature at all.
I'm not sure how the metadata syntax works, but you might not need it because you can do this:
I think you can use metadata to model HTML attributes, but in Clojure people use plain vectors for that: https://github.com/weavejester/hiccup
tl;dr: the first element of the vector is a tag, the second is a map of attributes, and the rest are child nodes:
[:h1 {:font-size "2em" :font-weight bold} "General Kenobi, you are a bold one"]
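To make the vector convention above concrete, here is a minimal sketch in Python of how such a node could be turned into markup. It is not how hiccup itself is implemented; the `render` function is a hypothetical helper, and the EDN vector is assumed to have already been parsed into a Python list with keywords and symbols represented as plain strings.

```python
from html import escape


def render(node):
    """Render a hiccup-style node: [tag, optional attribute map, *children]."""
    if isinstance(node, str):
        # Plain strings are text nodes.
        return escape(node)
    tag, *rest = node
    attrs = {}
    # The attribute map is optional; when present it is the second element.
    if rest and isinstance(rest[0], dict):
        attrs, rest = rest[0], rest[1:]
    attr_text = "".join(
        f' {name}="{escape(str(value))}"' for name, value in attrs.items()
    )
    children = "".join(render(child) for child in rest)
    return f"<{tag}{attr_text}>{children}</{tag}>"


# The vector from the comment above, assumed to be already parsed.
heading = ["h1", {"font-size": "2em", "font-weight": "bold"},
           "General Kenobi, you are a bold one"]
html = render(heading)
```

Because attributes live in an ordinary map in a fixed position, no metadata syntax is needed to distinguish them from child nodes.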
I think it would be better to not use Unicode (so that you can use any character set), and to use "0o" instead of "0" prefix for octal numbers. Also, EDN seems to lack a proper format for binary data.
I think ASN.1 (and ASN.1X, in which I added a few additional types such as key/value list and TRON string) is better. (I also made up a text-based ASN.1 format called TER, which is intended to be converted to the binary DER format. It is also intended that extensions and subsets of TER can be made for specific applications if needed.) (I also wrote a DER decoder/encoder library in C, and programs that use that library to convert TER to DER and to convert JSON to DER.)
ASN.1 (and ASN.1X) has many types similar to those of EDN, and a comparison can be made:
- Null (called "nil" in EDN) and booleans are available in ASN.1.
- Strings in ASN.1 are fortunately not limited to Unicode; you can also use ISO 2022, as well as octet strings and bit strings. However, there is no "single character" type.
- ASN.1 does have an Enumerated type, although the enumeration is made as numbers rather than as names. The EDN "keywords" type seems to be intended for enumerations.
- The integer and floating point types in ASN.1 are already arbitrary precision. If a reader requires a limited precision (e.g. 64 bits), it is easy to detect a value that is out of range and report an error.
- ASN.1 does not have a separate "list" and "vector" type, but does have a "set" type and a "sequence" type. A key/value list ("map") type is a nonstandard type in ASN.1X, but standard ASN.1 does not have a key/value list type.
- ASN.1 does have tagging, although it works differently from EDN. ASN.1 does already have a date/time type though, so this extension is not needed. Extensions are possible by application types and private types, as well as by other methods such as External, Embedded PDV, and the nonstandard
- The rational number type (present in edn.c, although the main EDN specification does not seem to mention it) is not a standard type in ASN.1, but ASN.1X does have such a type.
(Some people complain that ASN.1 is complicated; this is not wrong, but you will only need to implement the parts that you will use (which is simpler when using DER rather than BER; I think BER is not very good and DER is much better), which ends up making it simpler while also capable of doing the things that would be desirable.)
(But, EDN does solve some of the problems with JSON, such as comments and a proper integer type.)
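The point in the list above about detecting out-of-range integers can be sketched in a few lines of Python. This is only an illustration, not code from the DER library mentioned in the comment: `decode_der_integer` and `to_int64` are hypothetical names, and the input is assumed to be just the content octets of a DER INTEGER (big-endian two's complement), with the tag and length already stripped.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def decode_der_integer(content: bytes) -> int:
    """Decode the content octets of a DER INTEGER at arbitrary precision."""
    return int.from_bytes(content, "big", signed=True)


def to_int64(value: int) -> int:
    """Return value unchanged if it fits in 64 bits, otherwise raise."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError(f"{value} does not fit in a signed 64-bit integer")
    return value


# Largest value that still fits: 0x7f followed by seven 0xff octets.
largest = to_int64(decode_der_integer(b"\x7f" + b"\xff" * 7))

# One past the limit needs nine content octets and is rejected.
too_big = decode_der_integer(b"\x00\x80" + b"\x00" * 7)
try:
    to_int64(too_big)
    rejected = False
except OverflowError:
    rejected = True
```

A reader with a fixed-width integer type could apply the same check on the octet count and leading bits without ever building a big integer.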
> EDN seems to lack a proper format for binary data
The best part of EDN is that it is extensible :)
#binary/base64 "SGVsbG8sIHp6bzM4Y29tcHV0ZXIhIEhvdyBhcmUgeW91IGRvaW5nPw=="
This is a tagged literal that can be handled by a custom reader, if one is provided, while the document is being read. The result can be any type you want.
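As a rough illustration of the idea, here is a sketch in Python of a reader table that maps a tag to a handler. It does not use the edn.c API; `READERS` and `read_tagged` are hypothetical names, and the tag and its string value are assumed to have already been separated by a parser.

```python
import base64

# Hypothetical reader table: tag name -> function applied to the tagged value.
READERS = {
    "binary/base64": lambda text: base64.b64decode(text),
}


def read_tagged(tag, value, readers=READERS):
    """Apply the custom reader for tag, or keep the pair if none is provided."""
    handler = readers.get(tag)
    if handler is None:
        # No reader registered: hand back the tag and the raw value untouched.
        return (tag, value)
    return handler(value)


payload = "SGVsbG8sIHp6bzM4Y29tcHV0ZXIhIEhvdyBhcmUgeW91IGRvaW5nPw=="
data = read_tagged("binary/base64", payload)
unknown = read_tagged("my/unknown", "x")
```

Here the reader returns `bytes`, but a handler is free to return any type, which is what makes the tag mechanism a general extension point.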
I'm grateful for this! Love seeing EDN find its way into new places.
Very nice. Is there a plan to have an EDN writer in C as well?
Yes, that is the plan, but I haven't had time yet. It will most likely be available next week.
A very impressive implementation with SIMD and WASM!