Research project
Spatio-temporal analysis of unemployment rate at NUTS III level
Funder
Authors
Publications
Spatio-temporal methods and models for unemployment estimation
Publication. Pereira, Soraia Alexandra Gonçalves, 1989-; Turkman, Feridun, 1953-; Correia, Luís Paulo Fernandes
In Portugal, the National Statistical Institute (NSI) publishes official quarterly estimates of the labour market for the national territory and for NUTS I and NUTS II regions. NUTS is the nomenclature of territorial units for statistics, commonly used by Eurostat, and has three levels: NUTS I, NUTS II and NUTS III, in increasing order of disaggregation. The estimation is based on a direct method, using data from the Labour Force Survey (LFS). However, for NUTS III regions, the sample size of the LFS is not large enough to provide accurate estimates using this direct method. This is known as the small area estimation problem, and it arises in areas as disparate as epidemiology, ecology, economics and the social sciences, among others. Within the small area estimation (SAE) framework, several methods and models have been proposed, centred on the basic Fay-Herriot model and its extensions in several directions. However, the assumptions made in these models are very restrictive and do not appear to be suitable in the context of unemployment. In this study we propose three alternative approaches for unemployment estimation in small areas. The first approach is based on generalized linear models (GLM) at areal level, where three different data modelling strategies are considered and compared: modelling the total number of unemployed through Poisson, Binomial and Negative Binomial models; modelling rates using a Beta model; and modelling the three states of the labour market (employed, unemployed and inactive) by a Multinomial model. The second approach is based on spatial point processes. From the 4th quarter of 2014 onwards, all the sampling units of the LFS, namely the residential buildings, are georeferenced. For that reason, we propose using these new data, together with the information specific to the families, to model the intensity of points and the marks associated with those points through a marked log Gaussian Cox process model.
Here, the points are the residential buildings, whereas the associated marks are the numbers of unemployed people residing in these buildings. The basic assumption behind this model is that, although we know the georeferenced positions of the residential units in the LFS sample, we do not know the spatial configuration of all residential units in the population, and therefore we take the sampled residential units as a realization of a spatial point process. Recently, the NSI provided us with information about the locations of all residential buildings in the national territory. Consequently, it is no longer necessary to model the points, as all the residential buildings are georeferenced. Thus, the third method we propose is based on a point-referenced data model, also described as a geostatistical model. This model assumes that the points in the population are fixed and that the interest lies in modelling the spatial variation of the marks. The modelling process is based on a spatial extrapolation of the unemployment figures from the 14,000 residential buildings sampled in the LFS to all other known residential units not sampled by the survey. A comparison between these models, the direct method and the traditional small area models shows that the geostatistical model is the most favourable, owing to its good behaviour in terms of variability and the detailed information it can provide. We follow a Bayesian approach, and inference is carried out using the R-INLA package in the R software.
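For readers unfamiliar with the basic Fay-Herriot model mentioned in the abstract, its usual textbook form is sketched below. The notation is an assumption introduced here for illustration and is not taken from the thesis: m is the number of small areas, \hat{\theta}_i is the direct survey estimate for area i, x_i is a vector of area-level covariates, \psi_i is the sampling variance (treated as known), and \sigma_u^2 is the variance of the area random effects.

```latex
% Basic area-level (Fay-Herriot) model, standard textbook notation (assumed)
\begin{align*}
  \hat{\theta}_i &= \theta_i + e_i, & e_i &\sim N(0, \psi_i), \\
  \theta_i &= x_i^{\top}\beta + u_i, & u_i &\sim N(0, \sigma_u^2),
  \qquad i = 1, \dots, m. \\[4pt]
  % Resulting predictor: a weighted average of the direct estimate
  % and the regression prediction
  \tilde{\theta}_i &= \gamma_i\,\hat{\theta}_i + (1 - \gamma_i)\,x_i^{\top}\hat{\beta},
  & \gamma_i &= \frac{\sigma_u^2}{\sigma_u^2 + \psi_i}.
\end{align*}
```

When the sample in area i is small, \psi_i is large, \gamma_i is close to 0, and the predictor leans on the regression term rather than on the direct estimate. The Gaussian errors and the known sampling variances are the kind of assumption that can be hard to justify for counts or rates observed in small samples.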
Organizational units
Description
Keywords
Contributors
Funders
Funding entity
Fundação para a Ciência e a Tecnologia
Funding programme
Award number
SFRH/BD/92728/2013
