A Semantic Metadata Enrichment Software Ecosystem (SMESE): Its Prototypes for Digital Libraries, Metadata Enrichments and Assisted Literature Reviews

Keywords: Applied Science, Emotion Detection, Rich Literature Review, Machine Learning and Digital Libraries, Rich Semantic Metadata, Semantic Topic Detection, Sentiment Analysis

Abstract:

Contribution 1: Initial design of a semantic metadata enrichment ecosystem (SMESE) for Digital Libraries.
The Semantic Metadata Enrichments Software Ecosystem (SMESE V1) for Digital Libraries (DLs) proposed in this paper implements a Software Product Line Engineering (SPLE) process using a metadata-based software architecture approach. It integrates a components-based ecosystem, including metadata harvesting, text and data mining, and machine learning models. SMESE V1 is based on a generic model for standardizing meta-entity metadata and a mapping ontology to support the harvesting of various types of documents and their metadata from the web (structured and unstructured), databases and linked open data. SMESE V1 supports a dynamic metadata-based configuration model using multiple thesauri per application domain. The proposed model defines rules-based crosswalks that create pathways to different sources of data and metadata. Each pathway checks the metadata source structure and then performs data and metadata harvesting. SMESE V1 proposes a metadata model with six categories of metadata instead of the four currently proposed in the literature for DLs; this makes it possible to describe content by defined entity, thus increasing usability. In addition, to tackle the issue of varying degrees of depth, the proposed metadata model describes the most elementary aspects of a harvested entity, which corresponds to a data structure. A mapping ontology model has been prototyped in SMESE V1 to identify specific text segments based on thesauri in order to enrich content metadata with topics and emotions; this mapping ontology also allows interoperability between existing metadata models.

Contribution 2: Metadata enrichments ecosystem based on topics and interests.
The second contribution extends the original SMESE V1 proposed in Contribution 1. Contribution 2 proposes a set of topic- and interest-based content semantic enrichments. The improved prototype, SMESE V3 (see following figure), uses text analysis approaches for sentiment and emotion detection and provides machine learning models to create a semantically enriched repository, thus enabling topic- and interest-based search and discovery. SMESE V3 has been designed to find short descriptions in terms of topics, sentiments and emotions. It allows efficient processing of large collections while keeping the semantic and statistical relationships that are useful for tasks such as (1) topic detection, (2) content classification, (3) novelty detection, (4) text summarization and (5) similarity detection.

Contribution 3: Metadata-based scientific assisted literature review.
The third contribution proposes an assisted literature review (ALR) prototype, STELLAR V1 (Semantic Topics Ecosystem Learning-based Literature Assisted Review), based on machine learning models and a semantic metadata ecosystem. Its purpose is to identify, rank and recommend relevant papers for a literature review (LR). This third prototype can assist researchers, in an iterative process, in finding, evaluating and annotating relevant papers harvested from different sources and input into the SMESE V3 platform, available at any time. The key elements and concepts of this prototype are (1) text and data mining, (2) machine learning models, (3) classification models, (4) researchers' annotations and (5) semantically enriched metadata. STELLAR V1 helps the researcher to build a list of relevant papers according to a selection of metadata related to the subject of the ALR. The following figure presents the model, the related machine learning models and the metadata ecosystem used to assist the researcher in the task of producing an ALR on a specific topic.

Collection Classification: Content classification through machine learning and NLP.
Retrieval and Indexing: Data retrieval and indexing through topic detection and similarity detection, using a semantic metadata ecosystem.
Institutional Collection: Collecting and integrating files and metadata from different sources through ecosystems such as SMESE V1 and SMESE V3.
NLP (Natural Language Processing): The article refers to text and data mining, which involves analyzing, understanding and generating language, the core area of NLP.
Pattern Recognition: Content classification and similarity detection through machine learning models, which requires recognizing patterns in the data.
Machine Learning: Machine learning models are used for content classification, topic detection and related tasks.
Deep Learning: Deep learning is not explicitly mentioned in the article; it is included because most high-performing NLP models today depend on deep learning techniques.
Data Mining: The article refers to text and data mining, whose purpose is to extract useful information from large amounts of data.
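
The rules-based crosswalks and thesaurus-based enrichment described in the abstract can be illustrated with a minimal Python sketch. Every name here (CROSSWALKS, THESAURUS, the source field names, detect_source, harvest, enrich) is a hypothetical illustration and not SMESE's schema or API, and plain keyword matching stands in for the machine learning models the abstract mentions.

```python
import re

# Hypothetical crosswalk rules: source field -> field of a generic entity model.
# The field names are illustrative assumptions, not the SMESE metadata model.
CROSSWALKS = {
    "dublin_core": {"dc:title": "title", "dc:creator": "author", "dc:description": "abstract"},
    "marc_like": {"245a": "title", "100a": "author", "520a": "abstract"},
}

# Hypothetical thesaurus: topic label -> terms that signal the topic.
THESAURUS = {
    "machine learning": {"classifier", "neural", "training"},
    "digital libraries": {"metadata", "catalogue", "harvesting"},
}


def detect_source(record):
    """Check the record's structure and pick the crosswalk that matches it best."""
    best = max(CROSSWALKS, key=lambda name: len(CROSSWALKS[name].keys() & record.keys()))
    if not CROSSWALKS[best].keys() & record.keys():
        raise ValueError("no crosswalk matches this record structure")
    return best


def harvest(record):
    """Map a source record onto the generic entity model via its crosswalk."""
    rules = CROSSWALKS[detect_source(record)]
    return {target: record[src] for src, target in rules.items() if src in record}


def enrich(entity):
    """Add topic metadata by matching thesaurus terms against the abstract text."""
    words = set(re.findall(r"[a-z]+", entity.get("abstract", "").lower()))
    topics = sorted(topic for topic, terms in THESAURUS.items() if terms & words)
    return {**entity, "topics": topics}


record = {
    "245a": "Harvesting at scale",
    "100a": "Doe",
    "520a": "Metadata harvesting for a library catalogue.",
}
entity = enrich(harvest(record))
print(entity)
```

Keeping the crosswalk rules as data rather than code mirrors the metadata-based configuration idea in the abstract: supporting a new source would mean adding a rule table, not a new harvesting function.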
