The FPB not only publishes the results of its research, but also a set of databases. These databases can contain either historical data or forecasts.
In this database, domestic employment, labour volume (hours worked) and the wage costs of wage earners are broken down by industry (A38 of NACE Rev. 2), gender, age group and education level. The breakdowns of employment and labour volume are available separately for wage earners and the self-employed. The database provides annual results for the period 1999-2020, is consistent with the latest edition of the national accounts (October 2021) and is the source for the employment, labour volume and wage cost data for Belgium in the EUKLEMS database.
The methodology still corresponds broadly to the one described in Working Paper 02-07 of the Federal Planning Bureau, except that labour volumes are now processed in the same way as persons employed. A new methodology was developed for estimating wage costs by education level. The education levels concerned are primary or lower secondary education, higher secondary education, higher short-type education, and higher long-type or university education.
The industry totals for employment per status (wage earners/self-employed) in the latest edition of the national accounts (October 2021) are the starting point for the database. In a first stage, these figures are broken down further using administrative data from social security institutions. For wage earners, the number of persons employed, the labour volume and the wage costs per national accounts industry are split by gender, age group, sub-status (blue collar workers, white collar workers and civil servants) and working-time regime (full-time and part-time workers) on the basis of social security data. For the self-employed, the number of persons employed per national accounts industry is split by gender, age group and sub-status (self-employed in the strict sense, paid and non-paid assistants).
The social security data are merged with the NBB's company register at the firm/institution level, which guarantees a match with the industry classification used in the national accounts. Furthermore, the raw social security data are cleaned of time series breaks and adjusted to the concepts commonly used in the national accounts, in both cases as far as possible. Any remaining discrepancy with the national accounts is removed by proportionally scaling the figures to the industry totals of the national accounts.
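The proportional scaling in the last step can be illustrated with a minimal sketch. The function name `scale_to_total`, the cell keys and all figures are hypothetical and serve only to show the arithmetic; this is not the FPB's actual implementation.

```python
def scale_to_total(cells, target_total):
    """Scale every cell by the same factor so that the cells sum to target_total."""
    current = sum(cells.values())
    if current == 0:
        raise ValueError("cannot scale cells that sum to zero")
    factor = target_total / current
    return {key: value * factor for key, value in cells.items()}


# Hypothetical administrative counts for one industry, by (gender, age group).
admin_counts = {
    ("F", "25-39"): 400.0,
    ("F", "40-49"): 300.0,
    ("M", "25-39"): 200.0,
    ("M", "40-49"): 100.0,
}
# Hypothetical national accounts total for the same industry.
national_accounts_total = 1100.0

benchmarked = scale_to_total(admin_counts, national_accounts_total)
```

Because every cell is multiplied by the same factor, the relative distribution over gender and age group found in the administrative data is preserved while the industry total matches the national accounts.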
In a second stage, breakdowns that are not available in administrative data are generated by using survey data. For example, the labour force survey (LFS) was used to break down the labour volumes of the self-employed according to age group and gender. The split of persons employed and labour volumes according to education level was also based on this survey, both for wage earners and self-employed.
The LFS is a survey of 1% of the working-age population. To obtain stable results for all subgroups, regression techniques were used rather than subgroup averages. The number of persons employed per education level was estimated by logistic regression. The labour volume of the self-employed and the average labour volume per education level were estimated by OLS.
All estimates were made separately per gender and A38 industry for four sub-statuses (blue collar workers, white collar workers, civil servants & contract workers, self-employed & assistants) and for six large age groups (15-19, 20-24, 25-39, 40-49, 50-59, 60+). When there was a lack of observations, subgroups were merged.
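The text does not specify which subgroups were merged or at what threshold. Purely as an illustration of the idea, the sketch below merges adjacent age groups until each merged group reaches a minimum number of survey observations. The function `merge_sparse_groups`, the threshold of 30 and the observation counts are all assumptions.

```python
def merge_sparse_groups(counts, min_obs):
    """Merge adjacent groups, in order, until each merged group has >= min_obs.

    counts: list of (label, n_observations) in age order.
    Returns a list of (list_of_labels, n_observations).
    """
    merged = []
    labels, total = [], 0
    for label, n in counts:
        labels.append(label)
        total += n
        if total >= min_obs:
            merged.append((labels, total))
            labels, total = [], 0
    if labels:  # leftover tail that is still too small
        if merged:
            last_labels, last_total = merged.pop()
            merged.append((last_labels + labels, last_total + total))
        else:
            merged.append((labels, total))
    return merged


# Hypothetical LFS observation counts for one gender/industry/sub-status cell.
obs = [("15-19", 4), ("20-24", 18), ("25-39", 120),
       ("40-49", 95), ("50-59", 60), ("60+", 7)]
groups = merge_sparse_groups(obs, min_obs=30)
```

In this example the two youngest groups are absorbed into the 25-39 group and the 60+ group is joined to the 50-59 group, so the regressions would run on three age groups instead of six.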
To capture the impact of the Covid lockdowns, a dummy for the year 2020 was added to the regressions breaking down persons employed by education level and to the regressions used for estimating the labour volume by education level. In the former, the dummy can capture an increase or a decrease in the level of education in 2020. For small groups, it should be borne in mind that the LFS regressions are based on only 1% of the labour force. In the regressions for work duration, the 2020 dummy captures the overall negative effect of the lockdowns on the number of hours worked. There are no dummies to estimate a cross-effect by education level. As mentioned, the regressions are estimated by sub-status, so a possible Covid effect by sub-status is taken into account.
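How a year dummy isolates a one-off shock in an OLS regression can be shown with a toy example. The sketch below regresses hypothetical annual hours worked on a constant, a linear trend and a 2020 dummy. The data, the linear trend and the helper `ols` are assumptions made for illustration; the actual regressions are estimated on LFS microdata per sub-status and their specification is not given here.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical average annual hours: a mild downward trend plus a drop in 2020.
years = list(range(2015, 2021))
hours = [1600.0 - 2.0 * (y - 2015) for y in years]
hours[-1] -= 120.0  # assumed lockdown effect in 2020

# Regressors: constant, linear trend, dummy equal to 1 in 2020 only.
X = [[1.0, float(y - 2015), 1.0 if y == 2020 else 0.0] for y in years]
intercept, trend, dummy_2020 = ols(X, hours)
```

The coefficient on the dummy measures the gap between the 2020 observation and the value implied by the other years, so the lockdown effect does not distort the estimated trend.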
Finally, the wage costs of wage earners were broken down according to the highest education level attained. The calculation, made in two steps, was based mainly on the Wage Structure and Distribution Survey.
First, the gross wages of wage earners by industry were broken down by combining the previously calculated labour volumes with education level premiums. These education level premiums for gross wages were estimated using the results of the Wage Structure and Distribution Survey for the 2000-2019 period. Regression techniques also made it possible to generate annual education level premiums for a number of industries that were polled only every four years in the survey, as well as education level premiums for the years 1999 and 2020.
In a second stage, wage costs based on previously determined totals per gender, age group, industry, working-time regime and sub-status of wage earners were proportionally split between education levels according to the breakdown of gross wages. The breakdowns of wage costs that were calculated using administrative data are fully respected, but any variation in employer contributions or other employer costs between levels of education within the same industry, age group, sub-status and gender is not captured.
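The two steps can be condensed into a small numerical sketch for a single cell. The function `split_wage_costs`, the shortened education level labels, the hours, the premiums and the wage cost total are all hypothetical; the premiums are expressed relative to the lowest education level.

```python
def split_wage_costs(cell_wage_cost, hours_by_level, premium_by_level):
    """Split a cell's wage cost over education levels in proportion to gross wages.

    Gross wages per level are approximated as hours worked times the relative
    education level premium; the wage cost total of the cell is left unchanged.
    """
    gross = {lvl: hours_by_level[lvl] * premium_by_level[lvl]
             for lvl in hours_by_level}
    total_gross = sum(gross.values())
    return {lvl: cell_wage_cost * g / total_gross for lvl, g in gross.items()}


# Hypothetical cell: one gender, age group, industry, regime and sub-status.
hours = {"low": 100.0, "medium": 200.0, "high": 100.0}
premiums = {"low": 1.0, "medium": 1.25, "high": 2.0}  # assumed relative gross wage
costs = split_wage_costs(1100.0, hours, premiums)
```

The split reproduces the cell total exactly, which is why the administrative breakdowns are respected. It also makes the limitation visible: employer contributions are implicitly assumed to be the same share of gross wages at every education level within the cell.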
Three industries are not incorporated in the Wage Survey: Agriculture, forestry and fishing (A), Public administration and defense; compulsory social security (O) and Activities of households as employers (T). Education level premiums were calculated for the net wages in these industries in LFS data. As regards the industry Public administration and defense; compulsory social security (O), which gave work to 10.9% of wage earners in 2020, an additional adjustment was made for the conversion of net wages into gross wages. This adjustment is based on the net/gross wages ratios for comparable subgroups within the industry Education (P).
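The net-to-gross adjustment for industry O can be sketched as follows, assuming it amounts to dividing net wages by the net/gross ratio of the comparable subgroup in Education (P). The function `net_to_gross`, the two education level labels and all wage figures and ratios are hypothetical.

```python
def net_to_gross(net_wages, net_gross_ratio):
    """Convert net wages to gross wages using subgroup-specific net/gross ratios."""
    return {lvl: net_wages[lvl] / net_gross_ratio[lvl] for lvl in net_wages}


# Hypothetical average net wages in industry O, from LFS data.
net_wages_o = {"low": 1500.0, "high": 2400.0}
# Hypothetical net/gross ratios of comparable subgroups in Education (P).
ratio_p = {"low": 0.75, "high": 0.6}

gross_wages_o = net_to_gross(net_wages_o, ratio_p)

# Premiums relative to the lowest education level, before and after conversion.
net_premium = net_wages_o["high"] / net_wages_o["low"]
gross_premium = gross_wages_o["high"] / gross_wages_o["low"]
```

In this example the premium rises from 1.6 on a net basis to about 2.0 on a gross basis, because higher wages are assumed to face a lower net/gross ratio. This is the kind of distortion the adjustment is meant to correct.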
The database does not yet contain any breakdown of the mixed income of self-employed people per gender, age class or education level.
 Bresseleers et al., Kwalitatieve werkgelegenheidsdata voor België, een SAM-aanpak voor de periode 1999-2005. Working Paper 02-07, Federal Planning Bureau, 2007.
 A bottom-up approach is followed for wage costs. Totals per industry for different components of wage costs (gross wages; employers’ social contributions) are broken down separately using administrative data. The results for the wage costs follow from aggregation.
 The social security institutions concerned are the National Social Security Office (NSSO), the National Social Security Office for Provincial and Local Public Services, and the National Institute for the Social Security of the Self-Employed (NISSE).
 All regressions for work duration include a dummy for the year 2020. In the regressions breaking down the number of employed persons by education level, a dummy for 2020 was added only for blue-collar and white-collar workers in the private sector. For contract employees and civil servants in the public sector as well as for the self-employed, lockdowns were assumed to have no impact on the breakdown of employed persons by education level.
 As regards the industries Education (P), Human health activities (QA), Social work activities (QB), Arts, entertainment and recreation (R) and part of Other service activities (S), the Wage Structure and Distribution Survey has observations for the years 2006, 2010, 2014 and 2018 only.