DataLinter.version - Method
version()

Returns the current DataLinter version, determined from Project.toml and git. If Project.toml and git are not available, the version defaults to an empty string.
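
The fallback behaviour described above can be illustrated with a minimal sketch. It is not DataLinter's implementation: `sketch_version` is a hypothetical helper, it reads only the `version` field of a Project.toml file, and it leaves out the git lookup entirely.

```julia
using TOML

# Hypothetical helper, not part of DataLinter: read the `version` field from
# a Project.toml file and fall back to an empty string when it is unavailable.
function sketch_version(project_file::AbstractString)
    isfile(project_file) || return ""          # no Project.toml: empty string
    project = TOML.parsefile(project_file)     # Dict{String, Any}
    return string(get(project, "version", "")) # missing field: empty string
end
```

Calling `sketch_version` on a path that does not exist returns `""`, which mirrors the documented default.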

DataLinter.LinterCore.applicable - Method

Checks whether a linter is applicable. The iterable type must match, and if linter.linting_ctx==true a linting context must exist: specified in the config, derived from the presence of code, or both. For code-only linters the linting context is ignored, because code queries are allowed to fail or be missing.
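
The rule above can be sketched as a small predicate. This is an illustration under assumptions, not the package's code: `SketchLinter`, its field names (including `code_only`), and `sketch_applicable` are all hypothetical, and the real function's signature is not shown in this page.

```julia
# Hypothetical stand-in for a linter; all field names are assumptions.
struct SketchLinter
    iterable_type::Type  # kind of iterable the linter operates on
    linting_ctx::Bool    # true if the linter requires a linting context
    code_only::Bool      # true if the linter only queries code
end

# Hypothetical applicability check following the documented logic.
function sketch_applicable(linter::SketchLinter, iterable_type::Type;
                           config_ctx=nothing, code=nothing)
    # 1. The iterable type must match.
    iterable_type <: linter.iterable_type || return false
    # 2. Code-only linters ignore the linting context.
    linter.code_only && return true
    # 3. Linters that do not need a context are applicable at this point.
    linter.linting_ctx || return true
    # 4. Otherwise a context must come from the config, from code, or both.
    return config_ctx !== nothing || code !== nothing
end
```

For example, a linter declared as `SketchLinter(AbstractVector, true, false)` is applicable to a `Vector{Int}` only when `config_ctx` or `code` is supplied.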

DataLinter.LinterCore.lint - Method
lint(data_ctx::AbstractContext, kb::Union{Nothing, AbstractKnowledgeBase}; config=nothing, debug=false, linters=["all"])

Main linting function. Lints the data provided by data_ctx using knowledge from kb. A configuration for the available linters can be provided in config. If debug=true, performance information for each linter is shown. By default (linters=["all"]), all available linters are used.

DataLinter.LinterCore.reconcile_contexts - Method
reconcile_contexts(code_ctx, config_ctx)

Reconciles the contexts obtained from code and from the .toml configuration file. The basic approach is to take all available data from code_ctx and, where a value is not available there, fill it in from config_ctx.
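
The "code first, config as fallback" approach can be sketched as follows. This assumes, purely for illustration, that both contexts are dictionaries and that an unavailable value is stored as `nothing`; `sketch_reconcile` and the keys used in the usage note are hypothetical, and the actual context types in DataLinter may differ.

```julia
# Hypothetical sketch: prefer values from code_ctx, fall back to config_ctx.
function sketch_reconcile(code_ctx::AbstractDict, config_ctx::AbstractDict)
    merged = Dict{String,Any}(config_ctx)      # start from the config values
    for (key, value) in code_ctx
        # Only overwrite when the code context actually provides a value.
        value === nothing || (merged[key] = value)
    end
    return merged
end
```

With `code_ctx = Dict{String,Any}("target" => "y", "analysis_type" => nothing)` and `config_ctx = Dict{String,Any}("target" => "label", "analysis_type" => "classification")`, the result keeps `"y"` from the code and takes `"classification"` from the config.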

DataLinter.LinterCore.load_config - Method
load_config(configpath::AbstractString)

Loads the linting configuration file located at configpath. The configuration file specifies which linters are enabled and the parameter values for each linter.

Examples

julia> using DataLinter
       using Pkg
       configpath = joinpath(dirname((Pkg.project()).path), "config", "default.toml")
       DataLinter.LinterCore.load_config(configpath)
Dict{String, Any} with 2 entries:
  "parameters" => Dict{String, Any}("uncommon_signs"=>Dict{String, Any}(), "enum_detector"=>Dict{String, Any}("distinct_max_limit"=>5, "distinct_ratio"=>0.001), "empty_example"=>Dict{String, Any}(), "negative_…
  "linters"    => Dict{String, Any}("uncommon_signs"=>true, "enum_detector"=>true, "empty_example"=>true, "negative_values"=>true, "tokenizable_string"=>true, "number_as_string"=>true, "int_as_float"=>true, "l…
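
The output above suggests a configuration with a `linters` table of on/off flags and a `parameters` table with one sub-table per linter. The following sketch parses an inline configuration of that shape with Julia's TOML standard library. The layout is inferred from the printed dictionary only; the actual contents of default.toml are not shown here and may differ.

```julia
using TOML

# Assumed layout, inferred from the printed Dict above (two linters only).
config_text = """
[linters]
enum_detector = true
uncommon_signs = true

[parameters.uncommon_signs]

[parameters.enum_detector]
distinct_max_limit = 5
distinct_ratio = 0.001
"""

config = TOML.parse(config_text)   # Dict{String, Any}

# Names of the linters that are switched on, in alphabetical order.
enabled = sort([name for (name, on) in config["linters"] if on])

# Parameter lookup for a single linter.
limit = config["parameters"]["enum_detector"]["distinct_max_limit"]
```

Here `enabled` holds the two linter names and `limit` holds the `distinct_max_limit` value taken from the `parameters` table.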
DataLinter.DataInterface.build_data_context - Method
build_data_context(;data=nothing, code=nothing)

Builds a data context object using data and, if available, code. The data context represents the context in which the linter runs: the data it lints and, optionally, the code associated with that data, i.e. some algorithm that will be applied to it.

Examples

julia> using DataLinter
       ctx = DataLinter.build_data_context("./test/data/data.arrow")
DataContext{Arrow.Table} (0.11153507232666016 MB of data)

julia> kb = DataLinter.kb_load("");
       DataLinter.LinterCore.lint(ctx, kb)  # linters disabled
Pair{Tuple{DataLinter.LinterCore.Linter, String}, DataLinter.LinterCore.AbstractCheck}[]
Pair{Tuple{DataLinter.LinterCore.Linter, String}, DataLinter.LinterCore.AbstractCheck}[]


julia> config = DataLinter.LinterCore.load_config("./test/test_config.toml");
       DataLinter.LinterCore.lint(ctx, kb; config)  # linters enabled
120-element Vector{Pair{Tuple{DataLinter.LinterCore.Linter, String}, DataLinter.LinterCore.AbstractCheck}}:
                         (Linter (name=datetime_as_string, f=is_datetime_as_string), "column: x2") => DataLinter.LinterCore.NotAvailableCheck(nothing)
                         (Linter (name=datetime_as_string, f=is_datetime_as_string), "column: x5") => DataLinter.LinterCore.NotAvailableCheck(nothing)
                         (Linter (name=datetime_as_string, f=is_datetime_as_string), "column: x6") => DataLinter.LinterCore.PassedCheck(nothing)
                         ...