- Various refactorings (by m-muecke).
- HDBSCAN gained the parameter cluster_selection_epsilon to implement the cluster selection from Malzer and Baum (2020).
- Functions ncluster() and nnoise() were added.
- hullplot() now marks noise as x.
- Added clplot().
- pointdensity now also accepts a dist object as input and has the new type "gaussian" to calculate a Gaussian kernel estimate.
- Added the DBCV index.
- extractFOSC: Fixed total_score.
- dbscan now has tidymodels tidiers (glance, tidy, augment).
- kNNdistplot can now plot a range of k/minPts values.
- Added stats::nobs methods for the clusterings.
- kNN and frNN now contain the used distance metric.
- dbscan component dist was renamed to metric.
- Removed redundant sort in kNNdistplot (reported by Natasza Szczypien).
- Refactoring: use anyNA(x) instead of any(is.na(x)), and many more (by m-muecke).
- Reorganized the C++ source code.
- README now uses bibtex.
- Tests now use testthat edition 3 (by m-muecke).
- pointdensity now checks for missing values (reported by soelderer).
- Removed C++11 specification.
- ANN.cpp: fixed Rprintf warning.
- kNNdistplot gained parameter minPts.
- dbscan now retains information on distance method and border points.
- HDBSCAN now supports long vectors to work with larger distance matrices.
- Conversion from dist to kNN and frNN is now more memory efficient. It no longer coerces the dist object into a matrix of double the size, but extracts the distances directly from the dist object.
- Better description of how predict uses only Euclidean distances and more error checking.
- The package now exports a new generic for as.dendrogram().
- is.corepoint() now uses the correct epsilon value (reported by Eng Aun).
- Functions now check for cluster::dissimilarity objects, which have class dist but are missing attributes.
- Added is.corepoint() for DBSCAN.
- Added coredist() and mrdist() for HDBSCAN.
- Added comps() to find connected components.
- The reachability plot now shows all undefined distances as a dashed line.
- Fixed a memory leak in the mrd calculation.
- We now use roxygen2.
- Added predict for hdbscan (as suggested by moredatapls).
- LOF: fixed numerical issues with k-nearest neighbor distance on Solaris.
- Fixed description of k in kNNdistplot and added minPts argument.
- Fixed bug for tied distances in lof (reported by sverchkov).
- lof: the density parameter was changed to minPts to be consistent with the original paper and dbscan. Note that minPts = k + 1.
- Improved speed of LOF for large ks (following suggestions by eduardokapp).
- kNN: results are no longer sorted again for kd-tree queries, which is much faster (by a factor of 10).
- ANN library: annclose() is now only called once when the package is unloaded. This is in preparation to support persistent kd-trees using external pointers.
- hdbscan lost parameter xdist.
- Removed dependence on methods.
- Fixed problem in hullplot for singleton clusters (reported by Fernando Archuby).
- GLOSH now also accepts data.frames.
- GLOSH now returns 0 instead of NaN if there are k duplicate points in the data.
- kNN and frNN gained parameter query to query neighbors for points not in the data.
- sNN gained parameter jp to decide if the shared NN should be counted using the definition by Jarvis and Patrick.
- kNNdist gained parameter all to indicate if a matrix with the distance to all nearest neighbors up to k should be returned.
- kNNdist now correctly returns the distances to the kth neighbor (reported by zschuster).
- dbscan: check eps and minPts parameters to avoid undefined results (reported by ArthurPERE).
- pointdensity was double counting the query point (reported by Marius Hofert).
- OPTICS now calculates eps if it is omitted.
- Example now only uses igraph conditionally since it is unavailable on Solaris (reported by B. Ripley).
- Fixed problem with constant name on Solaris in ANN code (reported by B. Ripley).
- HDBSCAN was added.
- extractFOSC (optimal selection of clusters for HDBSCAN) was added.
- GLOSH outlier score was added.
- hullplot now uses filled polygons as the default.
- hullplot now uses PCA if the data has more than 2 dimensions.
- Added NN superclass for kNN and frNN with plot and with adjacencylist().
- Added shared nearest neighbor clustering as sNNclust() and sNN to calculate the number of shared nearest neighbors.
- Added pointdensity function.
- Unsorted kNN and frNN can now be sorted using sort().
- kNN and frNN now also accept kNN and frNN objects, respectively. This can be used to create a new kNN (frNN) with a reduced k or eps.
- Datasets added: DS3 and moon.
- Improved interface for dbscan() and optics(): ... is now passed on to frNN.
- OPTICS clustering extraction methods are now called extractDBSCAN and extractXi.
- kNN and frNN are now objects with a print function.
- dbscan now also accepts a frNN object as input.
- jpclust and sNNclust now return a list instead of just the cluster assignments.
- The package now has a vignette.
- Jarvis-Patrick clustering is now available as jpclust().
- Added hullplot to plot a scatter plot with added convex cluster hulls.
- OPTICS: added a predecessor correction step that is used by the ELKI implementation (Matt Piekenbrock).
- Fixed a memory problem in frNN (reported by Yilei He).
- OPTICSXi is now implemented (thanks to Matt Piekenbrock).
- DBSCAN now also accepts MinPts (with a capital M) to be compatible with the fpc version.
- DBSCAN objects are now also of class dbscan_fast to avoid clashes with fpc.
- DBSCAN and OPTICS now have predict functions.
- Added test for unhandled NAs.
- Fixed LOF for more than k duplicate points (reported by Samneet Singh).
- OPTICS: fixed second bug reported by Di Pang.
- all methods now also accept dist objects and have a search method "dist" which precomputes distances.
- OPTICS: fixed bug with first observation (reported by Di Pang).
- OPTICS: clusterings can now be extracted using optics_cut.
- Added tests (testthat).
- Input data is now checked to see whether it can safely be coerced into a numeric matrix (storage.mode double).
- Fixed self matches in kNN and frNN (now returns the first NN correctly).
- Added weights to DBSCAN.
- Added kNN interface.
- Added frNN (fixed radius NN) interface.
- Added LOF.
- Added OPTICS.
- All algorithms now check for interrupts (CTRL-C/Esc).
- DBSCAN now returns a list instead of a numeric vector.
- DBSCAN: Improved speed by avoiding repeated sorting of point ids.
- Added linear NN search option.
- Added fast calculation for kNN distance.
- fpc and microbenchmark are now used conditionally in the examples.
- Initial release.