# Modeling utils
These utilities model schemas for data stored in a Python dictionary or in YAML or JSON format.
They also take care of parsing, validation, and generating JSON schemas and basic documentation.
## Creating a schema
A schema is created by subclassing `ConfigSchema`. The schema structure is specified using type annotations.
```python
from .modeling import ConfigSchema


class SimpleSchema(ConfigSchema):
    integer: int = 5  # a default value can be specified
    string: str
    boolean: bool
```
More complex types can also be used in a schema, and schemas can be nested.
In Python, words in multi-word field names are separated by an underscore `_` (e.g. `simple_schema`).
```python
from typing import Dict, List, Optional, Union


class ComplexSchema(ConfigSchema):
    optional: Optional[str]  # this field is optional
    union: Union[int, str]  # integer and string are both valid
    list: List[int]  # list of integers
    dictionary: Dict[str, bool] = {"key": False}
    simple_schema: SimpleSchema  # nested schema
```
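The annotations above are ordinary `typing` constructs, so they can be inspected with the standard library. The following sketch is independent of `ConfigSchema`: `accepts_none` is a hypothetical helper introduced only for illustration, and nothing here claims to show how the library itself reads annotations.

```python
from typing import Any, Dict, List, Optional, Union, get_args, get_origin


def accepts_none(annotation: Any) -> bool:
    """Return True if the annotation is a Union that includes None (hypothetical helper)."""
    return get_origin(annotation) is Union and type(None) in get_args(annotation)


# Optional[str] is shorthand for Union[str, None].
optional_is_nullable = accepts_none(Optional[str])
union_is_nullable = accepts_none(Union[int, str])

# Container annotations expose their item types through get_args().
list_item_types = get_args(List[int])
dict_key_value_types = get_args(Dict[str, bool])
```

This is why `Optional[str]` can be treated as "string or missing", while `Union[int, str]` requires one of the two listed types.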
### Additional validation
If additional validation is needed, override the `_validate()` method.
It should raise a `ValueError` exception when validation fails.
```python
class FieldsSchema(ConfigSchema):
    field1: int
    field2: int

    def _validate(self) -> None:
        if self.field1 > self.field2:
            raise ValueError("field1 is bigger than field2")
```
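To make the hook pattern concrete without depending on the library, here is a minimal self-contained sketch. `MiniSchema` is a hypothetical stand-in for `ConfigSchema` that only copies fields from a dictionary and then calls `_validate()`. How the real library reports the error to the caller may differ.

```python
from typing import Any, Dict, get_type_hints


class MiniSchema:
    """Hypothetical stand-in for ConfigSchema: copies fields, then calls the hook."""

    def __init__(self, data: Dict[str, Any]) -> None:
        for name in get_type_hints(type(self)):
            setattr(self, name, data[name])
        self._validate()

    def _validate(self) -> None:
        """Default hook: performs no extra checks."""


class FieldsExample(MiniSchema):
    field1: int
    field2: int

    def _validate(self) -> None:
        if self.field1 > self.field2:
            raise ValueError("field1 is bigger than field2")


valid = FieldsExample({"field1": 1, "field2": 2})

# Invalid data: the hook raises and construction fails.
try:
    FieldsExample({"field1": 3, "field2": 2})
    error_message = None
except ValueError as e:
    error_message = str(e)
```

Running the hook after all fields are assigned lets it compare several fields at once, which per-field type checks cannot do.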
### Additional layer, transformation methods
It is possible to add layers to a schema and use a transformation method between layers to process a value.
A transformation method must be named after its field (`value` in this example) with an underscore `_` prefix.
In this example, `Layer2Schema` describes the input data and `Layer1Schema` describes the resulting data.
```python
from typing import Any, Union


class Layer1Schema(ConfigSchema):
    class Layer2Schema(ConfigSchema):
        value: Union[str, int]

    _LAYER = Layer2Schema

    value: int

    def _value(self, obj: Layer2Schema) -> Any:
        if isinstance(obj.value, str):
            return len(obj.value)  # transform str values to int; this is just an example
        return obj.value
```
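The naming convention can be illustrated with a small standalone sketch. `InputLayer` and `ResultLayer` are hypothetical classes, and the lookup through `getattr` is an assumption about how such a convention could be implemented, not a description of the library's internals.

```python
from typing import Any, Union, get_type_hints


class InputLayer:
    """Hypothetical input layer: accepts a string or an integer."""

    def __init__(self, value: Union[str, int]) -> None:
        self.value = value


class ResultLayer:
    """Hypothetical result layer: always stores an integer."""

    value: int

    def __init__(self, obj: InputLayer) -> None:
        for name in get_type_hints(type(self)):
            # Look for a transformation method named after the field, e.g. `_value`.
            transformer = getattr(self, f"_{name}", None)
            if callable(transformer):
                setattr(self, name, transformer(obj))
            else:
                # No transformation method: copy the value unchanged.
                setattr(self, name, getattr(obj, name))

    def _value(self, obj: InputLayer) -> Any:
        if isinstance(obj.value, str):
            return len(obj.value)
        return obj.value


from_str = ResultLayer(InputLayer("hello")).value
from_int = ResultLayer(InputLayer(7)).value
```

Here the string `"hello"` is transformed to its length, while the integer passes through unchanged.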
### Documentation and JSON schema
A schema can be documented using a simple docstring. The JSON schema is created by calling the `json_schema()` method on the schema class. It includes descriptions from the docstring, defaults, etc.
```python
class SimpleSchema(ConfigSchema):
    """
    This is description for SimpleSchema itself.

    ---
    integer: description for integer field
    string: description for string field
    boolean: description for boolean field
    """

    integer: int = 5
    string: str
    boolean: bool


json_schema = SimpleSchema.json_schema()
```
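The exact output of `json_schema()` is not shown here. As a rough illustration of the idea, the sketch below derives a JSON schema from annotations for a plain class. `SimpleExample`, `sketch_json_schema`, and the `_JSON_TYPES` mapping are hypothetical, and the resulting layout is an assumption that may differ from what the library produces.

```python
import json
from typing import Any, Dict, List, get_type_hints

# Assumed mapping from Python types to JSON schema type names.
_JSON_TYPES = {int: "integer", str: "string", bool: "boolean"}


class SimpleExample:
    """This is description for SimpleExample itself."""

    integer: int = 5
    string: str
    boolean: bool


def sketch_json_schema(cls: type) -> Dict[str, Any]:
    """Build a JSON schema object from the annotations of a class."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, annotation in get_type_hints(cls).items():
        prop: Dict[str, Any] = {"type": _JSON_TYPES[annotation]}
        if hasattr(cls, name):
            prop["default"] = getattr(cls, name)  # field has a default value
        else:
            required.append(name)  # no default, so the field is required
        properties[name] = prop
    return {
        "type": "object",
        "description": (cls.__doc__ or "").strip(),
        "properties": properties,
        "required": required,
    }


schema = sketch_json_schema(SimpleExample)
schema_text = json.dumps(schema, indent=2)
```

Fields with a default value carry it in the schema, and fields without one are listed as required.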
## Creating a custom type
Custom types can be made by extending the `BaseValueType` class, which is integrated into the parsing and validation process.
Use `DataValidationError` to raise an exception during validation. `object_path` is used to track the node in more complex or nested schemas and to create a useful log message.
```python
from typing import Any

from .modeling import BaseValueType
from .modeling.exceptions import DataValidationError


class IntNonNegative(BaseValueType):
    def __init__(self, source_value: Any, object_path: str = "/") -> None:
        super().__init__(source_value)
        # bool is a subclass of int, so it has to be excluded explicitly
        if isinstance(source_value, int) and not isinstance(source_value, bool):
            if source_value < 0:
                raise DataValidationError(f"value {source_value} is negative number.", object_path)
            self._value = source_value
        else:
            raise DataValidationError(
                f"expected integer, got '{type(source_value)}'",
                object_path,
            )
```
To support JSON schema generation, implement the `json_schema` class method.
It should return the [JSON schema representation](https://json-schema.org/understanding-json-schema/index.html) of the custom type.
```python
class IntNonNegative(BaseValueType):
    # ... __init__ as shown above ...

    @classmethod
    def json_schema(cls: Type["IntNonNegative"]) -> Dict[Any, Any]:
        return {"type": "integer", "minimum": 0}
```
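The explicit `bool` check in `IntNonNegative` matters because in Python `bool` is a subclass of `int`, so `True` would otherwise pass as the integer `1`. The standalone sketch below shows the same check in isolation. `is_non_negative_int` is a hypothetical helper and does not use `BaseValueType`.

```python
def is_non_negative_int(value: object) -> bool:
    """Return True only for real integers >= 0, rejecting bool values."""
    if isinstance(value, bool):
        return False  # True/False are instances of int, reject them explicitly
    return isinstance(value, int) and value >= 0


bool_is_int = isinstance(True, int)  # True: bool subclasses int

results = {
    "zero": is_non_negative_int(0),
    "positive": is_non_negative_int(3),
    "negative": is_non_negative_int(-1),
    "boolean": is_non_negative_int(True),
    "string": is_non_negative_int("3"),
}
```

The accepted values match the JSON schema `{"type": "integer", "minimum": 0}`, in which booleans are not integers either.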
## Parsing JSON/YAML
For example, YAML data for `ComplexSchema` can look like this.
In YAML and JSON, words in multi-word names are separated by a hyphen `-` (e.g. `simple-schema`).
```yaml
# data.yaml
union: here could also be a number
list: [1, 2, 3]
dictionary:
  key: false
simple-schema:
  integer: 55
  string: this is string
  boolean: false
```
To parse data in YAML format, use the `parse_yaml` function. For JSON format, use `parse_json`.
The parsed data is stored in a dict-like object that takes care of the `-`/`_` conversion.
```python
from .modeling import parse_yaml

# read data from file
with open("data.yaml") as f:
    str_data = f.read()

dict_data = parse_yaml(str_data)
validated_data = ComplexSchema(dict_data)
```
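The `-`/`_` conversion can be pictured with a naive standalone sketch. `hyphens_to_underscores` is a hypothetical helper that uses only the standard library. It renames every key, including user-defined dictionary keys, which the library's dict-like object may well treat differently.

```python
import json
from typing import Any


def hyphens_to_underscores(data: Any) -> Any:
    """Recursively rename dictionary keys from `a-b` to `a_b`."""
    if isinstance(data, dict):
        return {
            str(key).replace("-", "_"): hyphens_to_underscores(value)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [hyphens_to_underscores(item) for item in data]
    return data  # scalars are returned unchanged


raw = json.loads('{"simple-schema": {"integer": 55}, "list": [1, 2, 3]}')
converted = hyphens_to_underscores(raw)
```

After conversion, the key `simple-schema` from the input matches the Python field name `simple_schema`.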