Conference proceedings
Estimating the Quality of Data Using Provenance: A Case Study in eScience
Registered in:
9781629933948
19th Americas Conference on Information Systems, AMCIS 2013 - Hyperconnected World: Anything, Anywhere, Anytime, v. 2, p. 1442-1451, 2013.
2-s2.0-84893254834
Authors
Malaverri J.E.G.
Mota M.S.
Medeiros C.B.
Institution
Abstract
Data quality assessment is a key factor in data-intensive domains. The data deluge is aggravated by an increasing need for interoperability and cooperation across groups and organizations. New alternatives must be found to select the data that best satisfy users' needs in a given context. This paper presents a strategy to provide information to support the evaluation of the quality of data sets. This strategy is based on combining metadata on the provenance of a data set (derived from workflows that generate it) and quality dimensions defined by the set's users, based on the desired context of use. Our solution, validated via a case study, takes advantage of a semantic model to preserve data provenance related to applications in a specific domain. © 2013 by the AIS/ICIS Administrative Office. All rights reserved.