Powered by OpenAIRE graph
IRIS Cnr
Open Access
https://zenodo.org/record/5970...
Part of book or chapter of book
License: CC BY
Data sources: UnpayWall
Open Access
ZENODO
Conference object . 2022
License: CC BY
Data sources: ZENODO
Closed Access
CNR ExploRA
Conference object . 2021
Data sources: CNR ExploRA
Closed Access
https://doi.org/10.1007/978-3-...
Part of book or chapter of book . 2021 . Peer-reviewed
License: Springer TDM
Data sources: Crossref
https://dx.doi.org/10.60692/eh...
Other literature type . 2021
Data sources: Datacite
https://dx.doi.org/10.60692/vy...
Other literature type . 2021
Data sources: Datacite
DBLP
Conference object . 2021
Data sources: DBLP

A Co-occurrence Based Approach for Mining Overlapped Co-clusters in Binary Data

Authors: Yuri Santa Rosa Nassar dos Santos; Rafael Santiago; Raffaele Perego; Matheus Henrique Schaly; Luis Otavio Alvares; Chiara Renso; Vania Bogorny

Abstract

Co-clustering is a specific type of clustering that simultaneously clusters the objects and the attributes of a data matrix. Although general clustering techniques find non-overlapping co-clusters, finding possible overlaps between co-clusters can reveal patterns embedded in the data that disjoint clusters cannot discover. The overlapping co-clustering approaches proposed in the literature focus on finding global overlapped co-clusters and might overlook interesting local patterns that are not necessarily identified as global co-clusters. Discovering such local co-clusters increases the granularity of the analysis, so more specific patterns can be captured. This is the objective of the present paper, which proposes the new Overlapped Co-Clustering (OCoClus) method for finding overlapped co-clusters in binary data, including both global and local patterns. It is a non-exhaustive method based on the co-occurrence of attributes and objects in the data. Another novelty of this method is that it is driven by an objective cost function that can automatically determine the number of co-clusters. We evaluate the proposed approach on publicly available real and synthetic datasets and compare the results with a number of baselines. Our approach shows better results than the baseline methods on synthetic data and demonstrates its efficacy on real data.
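
To make the terms "co-occurrence" and "overlapped co-clusters" concrete, here is a minimal sketch on a toy binary matrix. It is not the OCoClus algorithm and does not include its cost function; the data and the helper names (`cooccurrence_counts`, `supporting_objects`, `cells`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Toy binary matrix (hypothetical data): rows are objects, columns are attributes.
DATA = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]


def cooccurrence_counts(matrix):
    """For every pair of attributes, count the objects in which both are 1."""
    n_attrs = len(matrix[0])
    return {
        (a, b): sum(1 for row in matrix if row[a] and row[b])
        for a, b in combinations(range(n_attrs), 2)
    }


def supporting_objects(matrix, attrs):
    """Indices of the objects whose row is 1 on every attribute in attrs."""
    return [i for i, row in enumerate(matrix) if all(row[a] for a in attrs)]


def cells(cluster):
    """Set of (object, attribute) cells covered by a co-cluster."""
    objs, attrs = cluster
    return {(o, a) for o in objs for a in attrs}


counts = cooccurrence_counts(DATA)

# Two candidate co-clusters, each an (object set, attribute set) block of ones.
cluster_a = (supporting_objects(DATA, (0, 1)), (0, 1))
cluster_b = (supporting_objects(DATA, (1, 2)), (1, 2))

# Cells that belong to both co-clusters form their overlap.
overlap = cells(cluster_a) & cells(cluster_b)

print(counts[(0, 1)], cluster_a, cluster_b, overlap)
```

In this toy example, object 1 and attribute 1 belong to both blocks, which is the kind of shared cell a disjoint co-clustering would have to assign to only one co-cluster.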

Country
Italy
Keywords

Artificial intelligence, Disjoint sets, Pattern recognition (psychology), Clustering, Trajectories, Co-clustering data mining, Clustering Algorithms, Cluster analysis, Artificial Intelligence, Biochemistry, Genetics and Molecular Biology, Document Clustering, Microarray Data Analysis and Gene Expression Profiling, FOS: Mathematics, Binary data, Novelty detection, CURE data clustering algorithm, Molecular Biology, Data mining, Granularity, Data Clustering Techniques and Algorithms, Arithmetic, Fuzzy clustering, Single-linkage clustering, Biclustering, Life Sciences, Novelty, Statistical and Nonlinear Physics, Semi-supervised Clustering, Computer science, FOS: Philosophy, ethics and religion, Philosophy, Operating system, Physics and Astronomy, Combinatorics, Computer Science, Physical Sciences, Statistical Mechanics of Complex Networks, Theology, Binary number, Density-based Clustering, Mathematics

  • Impact by BIP!
    selected citations: 0
    popularity: Average
    influence: Average
    impulse: Average
  • Usage by OpenAIRE UsageCounts
    views: 7
    downloads: 24