Categories
News

Fast and reliable programming language Bosque

Microsoft recently introduced a new programming language Bosque, which references the syntax and types of TypeScript, as well as the semantics of ML and Node/JavaScript. Author Microsoft computer scientist Mark Marron is  dedicated to eliminating the complexities that arise during programming and creating the language Bosque that he believes goes beyond mainstream structured programming.

Structured programming is now available everywhere. Whether you’re using C/C++ or a programming language like Java, Python, or Golang, this programming idea is basically used in the development process. It was originally designed to replace the program. The disadvantage is greater than the profitable goto grammar. Researchers used looping, ordering, and selection in structured programming ideas, and eventually completely replaced the goto instructions, and such programming ideas have been popular ever since.

Bosque is based on Mark’s paper ” Regularized Programming with the BOSQUE Language “. In the paper, the author points out that the structured programming and abstract data types that emerged in the 1970s allowed developers to mask the features of the underlying hardware architecture. Focus on writing functional code, and development has become less error-prone. On the basis of this, the author proposes a new programming idea–Regularized Programming, which avoids the structure by avoiding iterative processing of low-level looping operations and enriching languages ​​with algebraic data conversion operators. Programming.

The author also designed a new programming language, Bosque, for this idea. Specifically, existing programming has been simplified, becoming a standardized form, eliminating the main source of uncertainty. Based on a series of analysis, experience and validation of runtime and programmer development, and interviews with developers, the paper identifies five major sources of uncertainty in the development process:

  • Variable state and logical frames : Introducing variability into programming languages ​​undermines the ability to infer programs in a monotonous manner, forcing programmers (and any analysis tools) to determine what is still valid after the operation and which has failed. . At the same time, variable code return values ​​and side effects on parameters (or other global states) affect program state, which also negates the need to infer logical frames for each operation.
  • Loops, recursion, and invariants : looping and recursion are the most basic challenges of reasoning, because code describes the effects of a single step, but understanding the complete construct requires generalizing the quantized properties of a set of values, and the invariants provide the required joins. However, in general, such general computing techniques are not achievable.
  • Uncertain behavior : Uncertain behavior includes undefined, specified or undetermined environmental behavior, which requires a programmer or analytical tool to reason and explain all possible outcomes. For example: sorting stability, map/dictionary enumeration order, etc. These uncertain behaviors increase the complexity of the development process and are slowly seen as technical debts that should be removed over time.
  • Does not follow “data invariant” : programming languages ​​typically provide access and update operators for individual elements in an array/tuple or fields in an object/record, which are executed on a per-element basis, resulting in a program The member updates the state of the object in multiple steps, and the invariants normally held are temporarily invalidated before recovery. In these cases, the amount of detail that must be tracked and recovered greatly increases the likelihood of errors occurring.
  • Equivalence and aliases : Programming languages ​​are at the boundaries of mathematics and engineering. Although language semantics are expressed as mathematical concepts, there are some common situations, such as: reference equality, by value, by reference or evaluation order, which is actually the default underlying. It is the von Neumann architecture. Although seemingly insignificant, these choices have a major impact on comprehensibility, such as the reference equality leads to the complexity of alias relationship reasoning, and the compilation of other architectures becomes very complicated.

 

These uncertainties are the source of various bugs in the program, increasing the complexity of the developer’s understanding and implementation of the application’s functionality, while making the program’s automatic reasoning very complex or completely infeasible.

Among them, according to the interview with Mark by the technology media The Register , Mark believes that the problem of variable state , loop and reference equality is the most prominent.

Taking reference equality as an example, Mark points out that when two variables point to the same object in memory, the complexity of the problem increases. “It looks very simple, but once you have reference equality in the semantics, you must constantly Consider the relationship between it and the pointer alias it introduces.”

The most familiar looping mechanism also brings a lot of complexity. It was canceled in Bosque. Below is an example equivalent to the for loop in JavaScript:

//Functor (Bosque) 

Var a = List[Int]@{...}; 
//Pre: true 

Var b = a.map[Int](fn(x) => x*2); 
//Post: List[Int]::eq(fn(x, y) => y == x*2, a, b)

Bosque is derived from the concept of standardized programming, in order to solve the problems encountered in the current structured programming, the author thinks that the rise of structured programming is the first golden age of programmers and development tools, he believes this This standardized programming model will greatly improve the productivity of developers, improve the quality of software, and bring the second golden age of compilers and development tools.

Grammar example:

Add two numbers:

function add2(x: Int, y: Int): Int {
    return x + y;
}

add2(2, 3) //5

Odd detection using the rest parameter and lambda:

function allOdd(...args: List[Int]): Bool {
    return args.all(fn(x) => x % 2 == 1);
}

allOdd(1, 3, 4) //false

Batch update properties on Record

function update(point: {x: Int, y: Int, z: Int}, value: Int): {x: Int, y: Int, z: Int} {
    return point<~(y=value, x=-point.x);
}

update(@{x=1, y=2, z=3}, 5) //@{x=-1, y=5, z=3}

Inaccessible parameters:

function tryGetProperty(r?: {f: Int, k: Int}): Int? {
    return r?.f;
}

Sign:

function sign(x?: Int): Int {
    var! y;

    if(x == none || x == 0) {
        y = 0;
    }
    else {
        y = (x > 0) ? 1 : -1;
    }

    return y;
}
See the paper and source code for details: