[Article Summary]
Header: general information about the PMML document, such as copyright information for the model, its description, and information about the application used to generate the model, such as its name and version.

Data Mining Standards and Specifications

<PMML version="..." ...>
  <Header copyright="Copyright (c) 2023 Togaware" description="RPart Decision Tree">
    <Extension name="timestamp" value="20230215 06:51:50" extender="Rattle"/>
    <Extension name="description" value="iris tree" extender="Rattle"/>
    <Application name="Rattle/PMML" version="..."/>
  </Header>

The data dictionary records information about the data fields from which the model was built.

<DataDictionary numberOfFields="5">
  <DataField name="Species" ...>
    <Value value="setosa"/>
    <Value value="versicolor"/>
    <Value value="virginica"/>
  </DataField>
  <DataField name="..." optype="continuous" dataType="double"/>

Data Transformations: transformations map user data into a form that is more suitable for the mining model. PMML defines several kinds of simple data transformations.
• Normalization: map values to numbers; the input can be continuous or discrete.
• Discretization: map continuous values to discrete values.
• Value mapping: map discrete values to discrete values.
• Functions (custom and built-in): derive a value by applying a function to one or more parameters.
• Aggregation: summarize or collect groups of values.

Model: contains the definition of the data mining model.
• Model Name (attribute modelName)
• Algorithm Name (attribute algorithmName)
• Number of Layers (attribute numberOfLayers)

Mining Schema: lists all fields used in the model.
• Name: must refer to a field in the data dictionary.
• Usage type: defines how a field is used in the model. Typical values are active, predicted, and supplementary. Predicted fields are those whose values are predicted by the model.
• Outlier Treatment: defines the outlier treatment to be used.
• Missing Value Replacement Policy: if this attribute is specified, a missing value is automatically replaced by the given value.
• Missing Value Treatment: indicates how the missing value replacement was derived.

Targets: allow for postprocessing of the predicted value in the form of scaling, if the output of the model is continuous.

PMML Example: Association Rule
• t1: Cracker