dc.contributorCiferri, Ricardo Rodrigues
dc.contributorhttp://lattes.cnpq.br/8382221522817502
dc.contributorSantos, Marilde Terezinha Prado
dc.contributorhttp://lattes.cnpq.br/9826026025118073
dc.contributorhttp://lattes.cnpq.br/5342428610131873
dc.creatorBorges Junior, Sergio Ricardo
dc.date.accessioned2017-07-19T10:45:12Z
dc.date.available2017-07-19T10:45:12Z
dc.date.created2017-07-19T10:45:12Z
dc.date.issued2016-12-16
dc.identifierBORGES JUNIOR, Sergio Ricardo. SEnsembles – uma abordagem para melhorar a qualidade das correspondências de instâncias disjuntas em estudos observacionais explorando características idênticas e ensembles de regressores. 2016. Tese (Doutorado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2016. Disponível em: https://repositorio.ufscar.br/handle/ufscar/8911.
dc.identifierhttps://repositorio.ufscar.br/handle/ufscar/8911
dc.description.abstractIntroduction. The datasets used in observational studies contain instances belonging to two distinct groups (i.e., a treatment group and a control group), which are compared in order to estimate the effect of the treatment on the outcomes. In one approach, called Propensity Score Matching (PSM), the propensity score is estimated for the instances of both groups and the instances are then matched based on their propensity score values. The propensity score is the probability of being assigned to the treatment given the observed characteristics (e.g., income, sex and age). In this context, logistic regression is widely used to estimate the propensity score, and there is a great variety of instance correspondence methods. Objective. The main objective of this doctoral thesis is to investigate computational alternatives for improving the quality of instance correspondence in datasets used in observational studies. Methodology. Techniques that estimate the propensity score and methods that perform instance correspondence in observational studies were investigated. This made it possible to investigate how the identical characteristics of the instances could be exploited in a new correspondence process and how ensembles could replace logistic regression in estimating the propensity scores of the instances within the PSM process. Proposal. This thesis proposes a new approach within the PSM process, called “SEnsembles”, which aims to improve the quality of instance correspondence through two main processes whose techniques consider, separately, the identical characteristics of the instances and ensembles of regressors (more precisely, bagging, random forest and boosting). Results. The proposed approach “SEnsembles” improves the quality of instance correspondence for the majority of the calipers used (i.e., zero, 0.05, 0.10, 0.15, 0.20, 0.25 and 0.30) when compared to the baseline, Nearest Neighbor Matching (NNM). In the experiments where there was an improvement over the baseline, the technique that separates the identical characteristics of the instances improved the quality of correspondence by up to 53.8%, with an average gain of 12.1%, and reduced the number of matched pairs of instances by only 2.7% on average. The technique that replaced logistic regression with ensembles of regressors, in turn, achieved the best correspondence with caliper zero and with the values 0.20, 0.25 and 0.30, with improvements of up to 36.3% and an average gain of 12.7%, and a slight reduction of 7.6% in the number of matched pairs of instances.
dc.languagepor
dc.publisherUniversidade Federal de São Carlos
dc.publisherUFSCar
dc.publisherPrograma de Pós-Graduação em Ciência da Computação - PPGCC
dc.publisherCâmpus São Carlos
dc.rightsOpen access
dc.subjectEstudos observacionais
dc.subjectCorrespondência de instâncias
dc.subjectEscore de propensão
dc.subjectObservational studies
dc.subjectInstance correspondence
dc.subjectPropensity score
dc.subjectEnsembles
dc.titleSEnsembles – uma abordagem para melhorar a qualidade das correspondências de instâncias disjuntas em estudos observacionais explorando características idênticas e ensembles de regressores
dc.typeThesis
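
The abstract describes matching treated and control instances on their propensity scores under a caliper, with Nearest Neighbor Matching (NNM) as the baseline. As a rough illustration of that baseline idea only, the sketch below performs greedy 1:1 nearest-neighbor matching without replacement on propensity scores that are assumed to be already estimated. It is not the SEnsembles method from the thesis; the function name, instance identifiers and score values are hypothetical, and the thesis may define matching order or caliper scaling differently.

```python
def nearest_neighbor_match(treated, control, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping an instance id to its propensity score
        (scores are assumed to be precomputed, e.g. by logistic regression).
    caliper: maximum allowed absolute difference between matched scores.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # control instances not yet matched
    pairs = []
    for t_id, t_score in treated.items():
        if not available:
            break
        # Closest remaining control instance by propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        # Accept the pair only if it falls within the caliper.
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.52, "t3": 0.90}
control = {"c1": 0.30, "c2": 0.50, "c3": 0.60}

pairs_wide = nearest_neighbor_match(treated, control, caliper=0.10)
pairs_exact = nearest_neighbor_match(treated, control, caliper=0.0)
print(pairs_wide)
print(pairs_exact)
```

With a caliper of zero only instances with identical scores are paired, which is why tighter calipers tend to trade the number of matched pairs against the closeness of each match.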