boehnenelton2024 - BEJSON Quick Guide

BEJSON Fundamentals

The cornerstone of the BEJSON standard is Positional Integrity. Unlike standard JSON, which relies on repeated key-value pairs, BEJSON enforces a strict, index-based mapping. The order of field definitions in the Fields array dictates the exact sequence of data in every record within the Values array.

Index 0: Fields[0] (e.g., "ID") maps to Values[n][0]
Index 1: Fields[1] (e.g., "Name") maps to Values[n][1]
Index 2: Fields[2] (e.g., "Status") maps to Values[n][2]

This alignment lets a parser read a value by its array index instead of resolving a key name in every record, which keeps per-record overhead low in high-throughput automation.
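The positional mapping above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the sample data, the "name" and "type" keys inside each field object, and the helper record_to_dict are all hypothetical.

```python
# Hypothetical sample data; the field-object keys "name" and "type" are assumed.
fields = [
    {"name": "ID", "type": "integer"},
    {"name": "Name", "type": "string"},
    {"name": "Status", "type": "string"},
]
values = [
    [1, "alpha", "active"],
    [2, "beta", "retired"],
]


def record_to_dict(fields, record):
    """Pair Fields[i] with record[i]; position is the only link between them."""
    if len(record) != len(fields):
        raise ValueError("record length does not match Fields length")
    return {field["name"]: value for field, value in zip(fields, record)}


# Resolve a column index once, then read it from every record by position.
status_index = [field["name"] for field in fields].index("Status")
statuses = [record[status_index] for record in values]
```

Resolving the index once and reusing it for every record is what replaces the per-record key lookup of ordinary JSON objects.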

Structural Requirements

To ensure compliance with the BEJSON standard, every document must include six mandatory top-level keys that define its identity, schema, and data content. These keys provide the necessary metadata for automated systems to parse the file without external dependencies.

Format: This key identifies the file as a BEJSON document and must always contain the exact string value "BEJSON".

Format_Version: This key specifies the structural variant of the standard being applied, such as "104", "104a", or "104db".

Format_Creator: This key attributes the standard to its author and must be set to the string "Elton Boehnen".

Records_Type: This key is an array that declares the specific names of the entity types represented within the dataset.

Fields: This key contains an array of objects that defines the name and data type for every field, serving as the document's internal schema.

Values: This key holds a two-dimensional array of the actual data records, where each entry must strictly follow the positional order of the Fields array.
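As a rough sketch of how these six requirements might be checked, the Python function below reports missing or malformed top-level keys. The function name check_headers, the sample document, and the "name"/"type" keys inside each field object are assumptions made for illustration; only the six key names and the two fixed string values come from the list above.

```python
MANDATORY_KEYS = (
    "Format", "Format_Version", "Format_Creator",
    "Records_Type", "Fields", "Values",
)


def check_headers(doc):
    """Return a list of problems with the top-level keys (empty if none)."""
    problems = [f"missing key: {key}" for key in MANDATORY_KEYS if key not in doc]
    if doc.get("Format") != "BEJSON":
        problems.append('Format must be the string "BEJSON"')
    if doc.get("Format_Creator") != "Elton Boehnen":
        problems.append('Format_Creator must be the string "Elton Boehnen"')
    # Records_Type, Fields, and Values are all described as arrays.
    for key in ("Records_Type", "Fields", "Values"):
        if key in doc and not isinstance(doc[key], list):
            problems.append(f"{key} must be an array")
    return problems


# Hypothetical document used only to exercise the check.
sample_doc = {
    "Format": "BEJSON",
    "Format_Version": "104",
    "Format_Creator": "Elton Boehnen",
    "Records_Type": ["Device"],
    "Fields": [{"name": "ID", "type": "integer"}, {"name": "Name", "type": "string"}],
    "Values": [[1, "router-a"], [2, "router-b"]],
}
```

Calling check_headers(sample_doc) returns an empty list; removing or altering any of the six keys adds a message to it.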

Standard vs. Automation Versions

The BEJSON ecosystem provides specialized formats tailored to specific data processing needs. While both versions maintain the core principle of positional integrity, they differ significantly in their structural flexibility and data complexity constraints.

BEJSON 104 (The Foundational Standard): This version is engineered for strict, single-type data enforcement. It supports a wide range of data types, including complex structures like nested arrays and objects. However, 104 is rigid regarding metadata, forbidding any custom top-level keys beyond the six mandatory system headers. It is the preferred choice for high-throughput data pipelines where data richness and schema rigidity are paramount.

BEJSON 104a (The Automation Specialist): Designed for speed and contextual awareness, 104a allows developers to include custom top-level headers for rich, file-level metadata (such as Server_ID or Environment). To achieve maximum parsing efficiency, it restricts record data strictly to primitive types: string, integer, number, and boolean. This makes it ideal for configuration management and real-time system logging.

In practice, the choice between these versions depends on the complexity of the records and the need for external context. Use 104 for deep data modeling and 104a for lightweight, high-speed automation tasks.
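The two constraints that separate these versions can be sketched as a small Python check. It assumes a document already parsed into a dict; the function name version_problems is hypothetical. Because the text above does not say whether null is permitted in 104a records, the sketch only flags nested arrays and objects.

```python
MANDATORY_KEYS = {
    "Format", "Format_Version", "Format_Creator",
    "Records_Type", "Fields", "Values",
}


def version_problems(doc):
    """Check the version-specific rules described above for 104 and 104a."""
    problems = []
    version = doc.get("Format_Version")
    if version == "104":
        # 104 forbids any top-level key beyond the six mandatory ones.
        extra = sorted(set(doc) - MANDATORY_KEYS)
        if extra:
            problems.append(f"104 forbids custom top-level keys: {extra}")
    elif version == "104a":
        # 104a allows custom headers but restricts record data to primitives.
        for row_index, record in enumerate(doc.get("Values", [])):
            for col_index, value in enumerate(record):
                if isinstance(value, (list, dict)):
                    problems.append(
                        f"104a record {row_index}, index {col_index} "
                        "holds a nested array or object"
                    )
    return problems
```

A 104a document carrying a custom header such as Server_ID passes this check, while the same header in a 104 document is reported.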

104db: The Database Format

The BEJSON 104db specification turns a data interchange file into a portable, self-contained relational database. While the 104 and 104a versions are restricted to a single entity type per file, 104db lifts this constraint: a single document can store multiple datasets, such as Users, Assets, and Logs, while maintaining strict structural validation.

The Record_Type_Parent Discriminator

The architectural cornerstone of the 104db format is the Record_Type_Parent field. In this version, the first entry in the Fields array is non-negotiable: it must be defined as Record_Type_Parent with a type of string. The value at index 0 of every record in the Values array then serves as that record's type discriminator.

Index 0 Identification: For every row of data, the value at index 0 explicitly tells the parser which entity type that specific record represents.

Heterogeneous Data Streams: Because the discriminator identifies the type at the start of the row, a single file can interleave different record types. For example, Row 1 might be a "User" entity, while Row 2 is a "Permission" entity.
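A minimal sketch of reading such an interleaved stream follows. The document fragment, the field names other than Record_Type_Parent, and the helper group_by_type are hypothetical; the sketch also assumes every discriminator value must appear in Records_Type.

```python
# Hypothetical 104db fragment; only the keys needed for this sketch are shown.
doc = {
    "Records_Type": ["User", "Permission"],
    "Fields": [
        {"name": "Record_Type_Parent", "type": "string"},
        {"name": "user_id", "type": "integer"},
        {"name": "username", "type": "string"},
        {"name": "permission_id", "type": "integer"},
        {"name": "permission_name", "type": "string"},
    ],
    "Values": [
        ["User", 1, "ada", None, None],
        ["Permission", None, None, 10, "read"],
        ["User", 2, "lin", None, None],
    ],
}


def group_by_type(doc):
    """Split interleaved records into one list per declared entity type."""
    groups = {name: [] for name in doc["Records_Type"]}
    for record in doc["Values"]:
        entity = record[0]  # index 0 is the Record_Type_Parent discriminator
        if entity not in groups:
            raise ValueError(f"undeclared record type: {entity!r}")
        groups[entity].append(record)
    return groups
```

Because the discriminator sits at a fixed position, the grouping needs no knowledge of the other fields.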

Relational Modeling and Logical Keys

By supporting multiple entities, 104db enables sophisticated relational modeling. Automated systems can navigate complex data graphs by using standard ID-based referencing, effectively mimicking the Primary Key (PK) and Foreign Key (FK) relationships found in SQL databases.

In a typical automation workflow, a "Server" record might have a unique server_id. Subsequent "Maintenance_Event" records within the same file can include a server_id_fk field. Because the document is self-describing, an automated parser can reconstruct the relationship between the event and the server without querying an external database, ensuring that the data's full context remains intact during transit or archival.
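The Server and Maintenance_Event relationship described above could be reconstructed roughly as follows. The field names hostname, event_id, and description, the sample rows, and the helper events_with_server are assumptions added for illustration; only server_id and server_id_fk come from the text.

```python
# Hypothetical unified field list for a 104db file with two entity types.
field_names = [
    "Record_Type_Parent", "server_id", "hostname",
    "event_id", "server_id_fk", "description",
]
values = [
    ["Server", "srv-1", "db-01", None, None, None],
    ["Server", "srv-2", "web-01", None, None, None],
    ["Maintenance_Event", None, None, "evt-1", "srv-1", "disk replaced"],
    ["Maintenance_Event", None, None, "evt-2", "srv-2", "kernel patch"],
]


def events_with_server(field_names, values):
    """Join each Maintenance_Event to its Server via server_id_fk."""
    idx = {name: i for i, name in enumerate(field_names)}
    # Index Server records by their primary key.
    servers = {r[idx["server_id"]]: r for r in values if r[0] == "Server"}
    joined = []
    for record in values:
        if record[0] != "Maintenance_Event":
            continue
        server = servers.get(record[idx["server_id_fk"]])
        if server is None:
            raise ValueError("event refers to an unknown server_id")
        joined.append((server[idx["hostname"]], record[idx["description"]]))
    return joined
```

The join uses nothing beyond the file itself, which is the self-describing property the paragraph above relies on.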

Structural Integrity and Null Enforcement

To maintain positional integrity in a multi-entity environment, 104db utilizes a unified Fields array that encompasses every possible attribute for all declared entities. This introduces a strict requirement for null placeholders:

Fixed Row Length: Every record in the Values array must have the exact same number of elements as the master Fields array.

Explicit Nulls: If a field is defined for "Entity A" but the current record is "Entity B," the value for that field must be null. This ensures that the parser always knows exactly which index corresponds to which field, regardless of the record type.
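Both rules can be checked mechanically, as in the sketch below. The text above does not say how a field is tied to its entity, so the sketch assumes a separate, hypothetical owners mapping from field name to entity name; fields missing from that mapping, such as Record_Type_Parent, are treated as shared. The function name check_rows is also an assumption.

```python
def check_rows(field_names, owners, values):
    """Verify fixed row length and null placeholders for foreign-entity fields."""
    problems = []
    for row_index, record in enumerate(values):
        if len(record) != len(field_names):
            problems.append(
                f"row {row_index}: expected {len(field_names)} values, "
                f"got {len(record)}"
            )
            continue  # positions are unreliable, so skip the null check
        entity = record[0]
        for col_index, name in enumerate(field_names):
            owner = owners.get(name)  # None means the field is shared
            if owner is not None and owner != entity and record[col_index] is not None:
                problems.append(
                    f"row {row_index}: {name} must be null for {entity} records"
                )
    return problems
```

An empty result means every row has the full length and carries null in each position that belongs to another entity.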

This rigid adherence to structure lets 104db combine relational modeling with the index-based parsing of a flat positional file, which makes it well suited to complex automation orchestration and state-heavy configuration management.