No.
|
Packages & Interfaces
|
Issue |
Changes |
Comment |
1
|
javax.datamining.base. BuildSettings
|
public void setOutlierTreatment(String logicalAttrName,
OutlierTreatment treatment)
public void setOutlierIdentification(String logicalAttrName, Interval bounds)
The method descriptions are inconsistent: they note that the existence of the
name is checked by the verify method, yet also state that the methods throw an
exception if the attribute does not exist.
|
- setOutlierTreatment does not throw JDMException if the
attribute does not exist.
- setOutlierIdentification does not throw JDMException if the
attribute does not exist.
|
|
2
|
javax.datamining.base. BuildSettings
|
With BuildSettings,
attributes can be specified with usage, weight, outlier treatment and
outlier identification. However, the API does not provide means to
retrieve these attributes. This is needed when dealing with a restored
BuildSettings.
|
- New enumeration - AttributeRetrievalType {usage, weight,
outlierTreatment, outlierIdentification}
- New method - public String[]
getAttributeNames(AttributeRetrievalType)
|
|
3
|
javax.datamining.resource. Connection
|
There are methods that
return named objects, but none that return only the names of those objects.
Returning only the object names is important for efficiency when displaying
objects in a GUI.
|
New methods added to
Connection:
- public Collection getObjectNames(Date, Date,
NamedObject)
- public Collection getObjectNames(Date, Date, NamedObject,
Enum)
- public Collection getModelNames(MiningFunction,
MiningAlgorithm, Date, Date)
Note: getModelNames
- function cannot be null. |
Consideration for JDM 2.0 is a more
general and powerful ObjectFilter-based interface. This requires a
proposal with use cases, and agreement from vendors / users on need /
uptake.
|
4
|
javax.datamining.resource. Connection
|
Allow users to explicitly ask
the DME to load data, as an optimization hint. This is analogous to models,
which can be loaded into memory at the user's request. One use case is
building different models on the same data with different logical data and
algorithms. If data loading is not supported by the DME, the methods must
be a no-op, as for models.
|
New methods:
- public String[] getLoadedData() - Returns an array of data
URIs that are currently loaded. The result is an empty array if data
loading is not supported.
- public void requestDataLoad(String) - Requests the DME to
load the specified data into memory to enhance efficiency and performance.
The intent is for the data to remain in memory until
requestDataUnloaded is invoked for the same data,
or the connection terminates and no other connections are using the
data. This may be a no-op if the vendor does not need to load data into
memory or does not support the capability. It is an idempotent operation
if the data has not changed. This method can be invoked for multiple
data URIs. If the specified data does not exist or cannot be located, an
exception is thrown.
- public void requestDataUnloaded(String) - Informs the DME
that the specified data is no longer needed and that the data may be
removed from memory if necessary.
This may be a no-op if the vendor does
not require loading data into memory. It is an idempotent
operation. If the requested data does not exist or cannot be located,
an exception is thrown. |
|
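The load and unload contract described for change 4 can be sketched as follows. This is only an illustration under stated assumptions, not the JDM API: DataLoadSketch is a hypothetical in-memory stand-in, the three method names are taken from the list above, and IllegalArgumentException is a placeholder for the exception type, which the text does not name.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for the three Connection methods in change 4; not the JDM API. */
final class DataLoadSketch {
    private final Set<String> known;                          // data URIs this stand-in can locate
    private final Set<String> loaded = new LinkedHashSet<>(); // data URIs currently held in memory

    DataLoadSketch(Set<String> knownUris) {
        this.known = new LinkedHashSet<>(knownUris);
    }

    /** Returns the URIs currently loaded, in load order; empty if nothing is loaded. */
    public String[] getLoadedData() {
        return loaded.toArray(new String[0]);
    }

    /** Idempotent: requesting the same URI twice leaves a single entry. */
    public void requestDataLoad(String uri) {
        requireKnown(uri);
        loaded.add(uri);
    }

    /** Idempotent: unloading data that is not loaded changes nothing. */
    public void requestDataUnloaded(String uri) {
        requireKnown(uri);
        loaded.remove(uri);
    }

    // Assumption: IllegalArgumentException stands in for the real exception type.
    private void requireKnown(String uri) {
        if (!known.contains(uri)) {
            throw new IllegalArgumentException("Data not found: " + uri);
        }
    }

    public static void main(String[] args) {
        DataLoadSketch conn = new DataLoadSketch(Set.of("file:/data/customers.csv"));
        conn.requestDataLoad("file:/data/customers.csv");
        conn.requestDataLoad("file:/data/customers.csv"); // repeated request has no further effect
        System.out.println(conn.getLoadedData().length);
        conn.requestDataUnloaded("file:/data/customers.csv");
        System.out.println(conn.getLoadedData().length);
    }
}
```

A DME that does not support data loading could satisfy the same contract by making both request methods empty and always returning an empty array.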
5
|
javax.datamining. modeldetail.tree. TreeModelDetail
|
Tree model details
were omitted from the API. Add new methods for more information about the
decision tree model: the number of nodes, the number of leaf nodes, and
the tree depth.
|
New
methods:
- public int getTreeDepth()
- public int getNumberOfNodes()
- public int getNumberOfLeafNodes()
|
|
6
|
javax.datamining.modeldetail.tree. TreeNode
|
TreeModelDetail has getRules()
and getRule(int nodeId) methods, but not an ability to get the rule on a
TreeNode. Such functionality is present in Clustering which
also supports rules. |
New method:
- public Rule getRule() - Javadoc: Returns the rule associated
with the node. Any node in the tree can return its associated rule.
|
|
7
|
javax.datamining.algorithm.tree. TreeSettings
|
Currently only one kind of
minimum node size is allowed, which precludes vendors from accepting both
kinds (count and percent).
|
Deprecated methods:
- public double getMinNodeSize()
- public SizeUnit getMinNodeSizeUnit()
Note: These two methods return the value last
set.
New method in TreeSettings:
- public double getMinNodeSize(SizeUnit)
Note: When both count and percent are
specified, a node is not split when either criterion is
satisfied.
New method in
TreeSettingsFactory:
- public boolean
supportsMinNodeSizeUnit(SizeUnit)
|
|
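One possible reading of the note in change 7 is that a node is no longer split once its size falls below either minimum. The sketch below encodes that reading only; MinNodeSizeSketch and maySplit are hypothetical names, and measuring the percentage against the total number of training cases on a 0 to 100 scale is an assumption, since the text does not define it.

```java
/** Hypothetical helper illustrating one reading of the change 7 note; not part of JDM. */
final class MinNodeSizeSketch {
    private MinNodeSizeSketch() { }

    /**
     * Returns true when a node may still be split.
     * Assumption: minPercent is on a 0..100 scale relative to totalCases.
     */
    static boolean maySplit(long nodeCases, long totalCases, double minCount, double minPercent) {
        double nodePercent = 100.0 * nodeCases / totalCases;
        boolean countOk = nodeCases >= minCount;       // count criterion not triggered
        boolean percentOk = nodePercent >= minPercent; // percent criterion not triggered
        return countOk && percentOk;                   // either criterion alone blocks the split
    }

    public static void main(String[] args) {
        System.out.println(maySplit(50, 1000, 20, 10.0));  // false: 5% is below the 10% minimum
        System.out.println(maySplit(200, 1000, 20, 10.0)); // true: both minimums are met
    }
}
```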
8
|
javax.datamining. modeldetail.tree. TreeModelDetail clustering. ClusteringModel
|
Clarify the ranges of the values
for tree depth, level, and number of clusters returned from the models
|
Javadoc change: Explicitly
specify that all hierarchies (tree, clustering, taxonomy) start from level 0,
and state the following ranges:
- Tree depth > 0
- Number of nodes in the tree > 1
- Number of leaf nodes in the tree > 1
- Number of clusters > 0
|
|
9
|
javax.datamining.clustering. ClusteringSettingsFactory
|
Aggregation function
and attribute comparison function need to be coupled since they
are not independent, but there is no capability check that takes both
parameters.
|
Deprecated
methods
- public boolean supportsCapability(AggregationFunction)
- public boolean
supportsCapability(AttributeComparisonFunction)
New Method
- public boolean supportsCapability(AggregationFunction,
AttributeComparisonFunction)
Note: Both arguments
for the new method must be non-null. Either or both could be systemDefault
or systemDetermined. |
|
10
|
javax.datamining.clustering. ClusteringApplySettings
|
Unify descriptions for create
methods on apply settings:
- ClassificationApplySettings.create() - Creates an instance of
ClassificationApplySettings initialized to vendor-specific default
values
- RegressionApplySettings.create() - Creates an instance of
RegressionApplySettings initialized to vendor-specific default values
- ClusteringApplySettings.create() -
Creates an empty
instance of ClusteringApplySettings |
ClusteringApplySettingsFactory.create() - Javadoc
changed as follows: Creates an instance of
ClusteringApplySettings initialized to
vendor-specific default values.
|
|
11
|
javax.datamining.clustering. ClusteringApplyCapability
|
Correct Javadoc which uses
ClusteringApplyContentCapability, instead of
ClusteringApplyCapability. |
Javadoc changed to use
ClusteringApplyCapability. |
|
12
|
javax.datamining. supervised.classification. ClassificationApplySettings clustering. ClusteringApplySettings
|
Once an array of destination
attribute names is mapped to an apply content by the mapByRank method,
there is no way to get those attribute names for inspection. The same
is true of ClusteringApplySettings.
|
The following methods have been
added: ClassificationApplySettings
- public String[]
getMappedDestinationAttributeNames(ClassificationApplyContent)
ClusteringApplySettings
- public String[]
getMappedDestinationAttributeNames(ClusteringApplyContent)
|
|
13
|
javax.datamining. clustering. ClusteringApplySettings supervised.classification. ClassificationApplySettings |
Clarify in Javadoc that the cardinality of the
destination attribute names specified with the mapByRank method must be
the same across different invocations (with different apply
contents). Clarify also the effect of invoking
mapByCategory and mapTopPrediction methods with the apply content. State clearly that
the map methods cannot be used together, e.g., mapByRank and
mapPredictions cannot be used with the same apply settings. |
Make changes to the Javadoc accordingly, including
the following descriptions at interface level for ClusteringApplySettings
and ClassificationApplySettings:
- The map methods in this interface are mutually
exclusive. If a different kind of mapping is needed, then the previous
settings must first be reset with the resetMapping method.
ClassificationApplySettings
- mapByRank - If this method is invoked on the same
content, the previous mapping is replaced with the new
one. The cardinality of the destination attribute names specified
with this method must be the same across multiple invocations with
different apply contents. If the cardinality is different for an apply
content that has already been specified previously, then all previous
settings become nullified and the current invocation creates a new
setting.
- mapByCategory - If this method is invoked on the same pair of
apply content and category, then the previous setting is
replaced with the new one.
- mapPredictions - If the same content
is used, the previous mapping is replaced with the new one.
- mapTopPrediction - If this method is invoked on the same
apply content, then the previous setting is replaced with the new one.
ClusteringApplySettings
- mapByClusterIdentifier - If this method is invoked on the
same pair of apply content and cluster identifier, then the previous
setting is replaced with the new one.
- mapByRank - same as ClassificationApplySettings.mapByRank
- mapClusters - same as
ClassificationApplySettings.mapPredictions
- mapTopCluster - same as
ClassificationApplySettings.mapTopPrediction
|
|
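The mapping rules described for change 13 can be modelled roughly as below. This is a hedged sketch, not the JDM interfaces: ApplyMappingSketch is hypothetical, apply contents are reduced to plain strings, and only mapByRank, mapTopPrediction and resetMapping are modelled. Two behaviours are assumptions that the text does not specify: mixing map methods without a reset raises IllegalStateException, and a new apply content with a mismatched cardinality raises IllegalArgumentException.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the mapping rules in change 13; not the JDM API. */
final class ApplyMappingSketch {
    private enum Kind { BY_RANK, TOP_PREDICTION }

    private Kind activeKind; // null until a map method is used, and again after resetMapping()
    private final Map<String, String[]> rankMappings = new LinkedHashMap<>();
    private final Map<String, String> topMappings = new LinkedHashMap<>();

    public void mapByRank(String content, String[] destAttrNames) {
        requireKind(Kind.BY_RANK);
        String[] previous = rankMappings.get(content);
        if (previous != null && previous.length != destAttrNames.length) {
            // Known content with a new cardinality: all previous settings are nullified.
            rankMappings.clear();
        } else if (previous == null && !rankMappings.isEmpty()
                && rankMappings.values().iterator().next().length != destAttrNames.length) {
            // Assumption: a new content with a mismatched cardinality is rejected.
            throw new IllegalArgumentException("cardinality differs from earlier mapByRank calls");
        }
        rankMappings.put(content, destAttrNames.clone()); // same content: old mapping is replaced
    }

    public void mapTopPrediction(String content, String destAttrName) {
        requireKind(Kind.TOP_PREDICTION);
        topMappings.put(content, destAttrName); // same content: old setting is replaced
    }

    public void resetMapping() {
        rankMappings.clear();
        topMappings.clear();
        activeKind = null;
    }

    public int rankMappingCount() {
        return rankMappings.size();
    }

    // Assumption: using a different map method without resetMapping() is an error.
    private void requireKind(Kind kind) {
        if (activeKind != null && activeKind != kind) {
            throw new IllegalStateException("call resetMapping() before using a different map method");
        }
        activeKind = kind;
    }
}
```

Under this model, calling mapByRank twice for the same content with three names instead of two discards every earlier rank mapping and keeps only the new one.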
14
|
javax.datamining.base. Task
|
Each Task subtype has a verify
method. Produce a cleaner design by moving the method to Task. Each child
interface must still implement the method. The Javadoc of each child
interface must describe how verification can be done, and that
verification is vendor specific.
|
Methods moved to Task:
- public VerificationReport BuildTask.verify()
- public VerificationReport ImportTask.verify()
- public VerificationReport ExportTask.verify()
- public VerificationReport
ComputeStatisticsTask.verify()
- public VerificationReport ApplyTask.verify()
- public VerificationReport
ClassificationTestTask.verify()
- public VerificationReport
ClassificationTestMetricsTask.verify()
|
|
15
|
javax.datamining.task.apply. ApplyTask
|
The Javadoc for verify() says at
the end:
On execute, if a signature attribute does not have a mapped
input, an exception is raised if synchronous, or a error status if
asynchronous.
This means that an exception is thrown if an attribute is
missing in the apply data, but that should depend on the implementation. For
example, record apply may supply only a partial set of attributes in the
record. The same should hold for data set apply. |
Javadoc change:
ApplyTask
description augmented with:
If a signature attribute does not have
a mapped attribute in the input data, it is the vendor's choice to regard
it as a missing value and continue the apply operation. An exception may
also be thrown if the vendor does not support such a feature. |
Related to change #13
|
16
|
javax.datamining. supervised.classification. ClassificationTestTask
|
<?> omits a
method to set the description for the test metrics object it creates.
Such a method is needed because test metrics is a named object.
|
New methods
added:
- public String getTestMetricsDescription()
- public void setTestMetricsDescription(String description)
|
|
17
|
javax.datamining.algorithm.svm. classification. SVMClassificationSettingsFactory regression. SVMRegressionSettingsFactory
|
SVM classification and regression
need supportsCapability method to check which kernel functions are
supported by the implementation. |
New
Methods:
SVMClassificationSettingsFactory
- public boolean supportsCapability(KernelFunction)
SVMRegressionSettingsFactory
- public boolean supportsCapability(KernelFunction)
Note: The argument kernelFunction cannot be null.
|
|
18
|
javax.datamining.algorithm. svm.classification. SVMClassificationSettings svm.regression. SVMRegressionSettings
|
The following parameters for SVM
should not allow 0: Complexity factor, Tolerance,
Epsilon. For example, accepting 0 for tolerance precludes
convergence.
|
Javadoc
changed:
SVMClassificationSettings
- setComplexityFactor - The factor must be a positive number.
- setTolerance - The value must be greater than 0 and less than
1.
SVMRegressionSettings
- setComplexityFactor - The factor must be a positive number.
- setTolerance - The value must be greater than 0 and less than
1.
- setEpsilon - The value must be a positive number that is less
than 1.
|
|
19
|
javax.datamining. modeldetail.svm. SVMClassificationModelDetail SVMRegressionModelDetail modeldetail.naivebayes. NaiveBayesModelDetail
|
It is difficult to get
coefficients or target probabilities when logical data is not available with
the model, because the existing methods take attribute values as arguments
and therefore require knowledge of the data.
|
New
methods:
SVMClassificationModelDetail
- public java.util.Map getCoefficients(Object targetValue,
String attrName) - returns a Map of pairs of attribute value and its
coefficient associated with the specified target
value
SVMRegressionModelDetail
- public java.util.Map getCoefficients(String attrName) -
returns a Map of pairs of attribute value and its coefficient
NaiveBayesModelDetail
- public java.util.Map getPairProbabilities(String attrName,
Object targetValue)
|
|
20
|
javax.datamining.association. AssociationSettings AssociationModel
|
Unify the range
specification of support and confidence. Some use [0..1] and some use
[0..100]. In addition, both boundary values must be
allowed.
|
Javadoc descriptions
have been changed to use [0..100] in
interfaces:
AssociationModel
- getMaxConfidence
- getMinConfidence
AssociationSettings
- setMinConfidence
- setMinSupport
|
|
21 |
javax.datamining.association. AssociationRule |
A rule ID is needed in preparation for apply with association rules (AR),
so that the rule ID associated with a prediction can be provided. |
New method:
- public int getRuleIdentifier()
|
|
22 |
javax.datamining.data. CategoryMatrix |
public Double getValue(Object rowCategoryValue,
Object columnCategoryValue) throws JDMException
This method
returns Double, but its behavior is not clear when a non-existing entry is
specified. If it returns null for such entries, then it cannot
support a sparse representation of matrices. Also, it appears that an
exception need not be thrown if the method returns null for
non-existing entries.
A method with a new name, such as getCellValue, needs to be introduced,
since overloading on the return type alone is not possible. A
new set method also needs to be introduced for completeness.
It is
also noted that CategoryMatrix is a common super interface of three other
interfaces, but they share little in common; they are tied together as a
CategoryMatrix simply because they bear a name that includes
Matrix. CategoryMatrix is deprecated. |
CategoryMatrix: deprecated (along
with getValue method)
New methods in
SimilarityMatrix
- public double getCellValue(Object category1, Object
category2)
- public void setCellValue(Object category1, Object category2,
double similarityValue)
New methods in
CostMatrix:
- public double getCellValue(Object actualTarget, Object
predictedTarget)
- public void setCellValue(Object actualTarget, Object
predictedTarget, double cost)
Note: The new get
methods return a default value that depends on the type of the
interface if the entry does not exist, and do not throw an exception. For
example, CostMatrix getCellValue returns 0 for diagonal entries, even if
they are not specified. The new set methods also do not throw an exception
because they return null if the cell is not found.
Note:
ConfusionMatrix already has a method getNumberOfPredictions
that is the equivalent of getCellValue.
|
From JDM 2.0, the two remaining methods in
CategoryMatrix will be moved down to CostMatrix, ConfusionMatrix and
SimilarityMatrix and CategoryMatrix will
be removed (deprecated) entirely.
|
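The default-value behaviour described for the new CostMatrix get method in change 22 could look roughly like the sketch below. It is not the JDM implementation: SparseCostMatrixSketch is a hypothetical class, and the off-diagonal default of 1.0 is an assumption, since the text only states that unspecified diagonal entries return 0.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sparse cost matrix illustrating change 22; not the JDM CostMatrix. */
final class SparseCostMatrixSketch {
    // Only explicitly set cells are stored; the key is the (actual, predicted) pair.
    private final Map<List<Object>, Double> cells = new HashMap<>();

    public void setCellValue(Object actualTarget, Object predictedTarget, double cost) {
        cells.put(List.of(actualTarget, predictedTarget), cost);
    }

    /** Returns the stored cost, or a default for cells that were never set; never throws for them. */
    public double getCellValue(Object actualTarget, Object predictedTarget) {
        Double stored = cells.get(List.of(actualTarget, predictedTarget));
        if (stored != null) {
            return stored;
        }
        // Diagonal default of 0 follows the text; the 1.0 off-diagonal default is an assumption.
        return actualTarget.equals(predictedTarget) ? 0.0 : 1.0;
    }

    public static void main(String[] args) {
        SparseCostMatrixSketch costs = new SparseCostMatrixSketch();
        costs.setCellValue("churn", "stay", 5.0);
        System.out.println(costs.getCellValue("churn", "churn")); // unspecified diagonal entry
        System.out.println(costs.getCellValue("churn", "stay"));  // explicitly set entry
    }
}
```

Returning a primitive double with a default is what makes a sparse representation workable: callers never need to distinguish a missing cell from a stored one.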
23
|
javax.datamining.data. PhysicalDataSet |
The API does not provide a means to inspect the
physical attributes based on data type or role. For example, if a physical
data set is created by metadata import, attributes with an unsupported data type
are marked as unknown. |
New methods:
- public Collection getAttributeNames(AttributeDataType dataType)
- public Collection getAttributeNames(PhysicalAttributeRole role)
Note: These methods return a collection of
attribute names.
|
|