Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is "syntactically correct by construction" worth it? #18

Closed
LightAndLight opened this issue Jan 30, 2018 · 0 comments
Closed

Is "syntactically correct by construction" worth it? #18

LightAndLight opened this issue Jan 30, 2018 · 0 comments

Comments

@LightAndLight
Copy link
Contributor

LightAndLight commented Jan 30, 2018

Until now, the approach to the AST has been "as correct-by-construction as possible, falling back to smart constructors when necessary". Since we've been getting closer to a complete representation, I have been considering another approach that gets similar levels of safety but permits a more elegant library design, and potentially better user experience.

Here it is, applied to a very small AST:

{-# language DataKinds, PolyKinds, LambdaCase, ViewPatterns #-}
module AST (AST, Val(..), _Int, _Add, _Assign, unvalidate, validate) where

import Control.Lens
import Data.Coerce

data Val = UV | V

data AST (a :: Val)
  = Int Int
  | Add (AST a) (AST a)
  | Assign String (AST a)
  deriving (Eq, Show)

_Int :: Prism (AST a) (AST UV) Int Int
_Int =
  prism
    Int
    (\case; (unvalidate -> Int a) -> Right a; (unvalidate -> a) -> Left a)

_Add :: Prism (AST a) (AST UV) (AST UV, AST UV) (AST UV, AST UV)
_Add =
  prism
    (uncurry Add)
    (\case; (unvalidate -> Add a b) -> Right (a, b); (unvalidate -> a) -> Left a)

_Assign :: Prism (AST a) (AST UV) (String, AST UV) (String, AST UV)
_Assign =
  prism
    (uncurry Assign)
    (\case; (unvalidate -> Assign a b) -> Right (a, b); (unvalidate -> a) -> Left a)

unvalidate :: AST a -> AST UV
unvalidate = coerce

validate :: AST UV -> Maybe (AST V)
validate (Int a) = Just $ coerce $ Int a
validate (Add a b) =
  fmap coerce $
  Add <$> (coerce <$> validate a) <*> (coerce <$> validate b)
validate (Assign a b)
  | a == "bad" = Nothing
  | otherwise =
    fmap coerce $
    Assign a <$> (coerce <$> validate b)

In this approach, I use a phantom type to indicate that the AST has been validated. Due to the types of the prisms, only unvalidated terms can be constructed, but terms of any validation status can be matched on. This means we can use the same data structure to represent validated and unvalidated terms. The current codebase has two distinct datatypes (with a lot of duplication). The other consequence is that all syntax-correction checking is moved to run-time. I don't believe this is a bad thing, considering the amount of checks that are already performed at run-time.

This pattern also fixes the optics problem that is demonstrated in #17.

There is a small "safety" flaw with this approach, that a user can just use coerce to skip the validation stage. Currently I think that's an okay trade-off.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant