The HILDA Survey User Manual is the best place to start, as it answers numerous questions about using the data.
To locate variable names quickly, the HILDA Survey User Manual should be used in conjunction with the cross-wave index, which is searchable by question number, keyword or variable name (excluding the first-character wave identifier).
The ABS defines an employee as
a person who works for a public or private employer and receives remuneration in wages, salary, a retainer fee from their employer while working on a commission basis, tips, piece-rates, or payment in kind; or a person who operates his or her own incorporated enterprise with or without hiring employees.
In other words, their definition of employee includes owner managers who operate their own incorporated businesses (i.e., they are treated as “employees of their own business”).
We believe this distinction is misleading for many research purposes, so in our data releases we provide all of the necessary information for researchers to construct their own definitions of employees and the self-employed.
In Mark Wooden’s own research on labour market behaviour, for example, he almost always discards the ABS definition and combines “employee of own business” with the “employer/self-employed” group.
The weight that could be used for this sample is the cross-sectional responding person weight from each wave. That is, a person's Wave 1 observation would be weighted by their Wave 1 cross-sectional responding person weight, their Wave 2 observation by their Wave 2 cross-sectional responding person weight, and so on.
If you pool, say, five waves of data together, the sum of the weights will be around 100 million (that is, five times the average population size between 2001 and 2005). Therefore, you may wish to rescale the weights by dividing each one by the number of waves you have included in the unbalanced panel, so that they sum to roughly the average population size.
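The rescaling described above can be sketched in a few lines of Python. This is only an illustration: the record layout and the field names `wave` and `weight` are hypothetical stand-ins for whichever cross-sectional responding person weight you have merged onto each person-wave record, and the weight values are invented.

```python
def rescale_pooled_weights(rows, weight_key="weight", wave_key="wave"):
    """Divide each cross-sectional weight by the number of waves pooled.

    `rows` is a list of person-wave records (dicts). The key names are
    hypothetical; substitute the weight variable you actually use.
    """
    n_waves = len({row[wave_key] for row in rows})
    return [
        {**row, "weight_rescaled": row[weight_key] / n_waves}
        for row in rows
    ]


# Invented example: two waves pooled, three person-wave observations.
pooled = [
    {"xwaveid": "0100001", "wave": 1, "weight": 1200.0},
    {"xwaveid": "0100001", "wave": 2, "weight": 1300.0},
    {"xwaveid": "0100002", "wave": 1, "weight": 900.0},
]
rescaled = rescale_pooled_weights(pooled)
total_rescaled = sum(row["weight_rescaled"] for row in rescaled)
```

After rescaling, the weights sum to the average of the per-wave totals rather than to their sum, which is usually the scale you want for population estimates.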
Alternatively, if your analysis requires at least two observations on the same individual, then you will be dropping those people who are only interviewed once. The cross-sectional weights, therefore, will not be appropriate (nor will the longitudinal weights).
When you are analysing an uncommon event (for example, divorce), you can pool the sample across waves. This sample, however, is subject to attrition that is not random, so it needs to be weighted.
If you have pooled respondents across waves, you should use the cross-sectional responding person weight for the wave from which the case has been contributed.
The HILDA sample in Wave 1 excluded people living in institutions (such as hospitals and other healthcare institutions, military and police installations, correctional and penal institutions, convents and monasteries) and other non-private dwellings (such as hotels and motels).
People who move into these dwellings after Wave 1 are given zero cross-sectional weights and zero longitudinal weights for the balanced panel, starting from the wave in which they began living in a non-private dwelling.
The HILDA sample also excluded people living in remote and sparsely populated areas. Some of these areas are excluded from the Australian Bureau of Statistics' population benchmarks, which are used in the weighting process.
For Releases 1 to 4, the benchmarks only excluded remote and sparsely populated areas in the Northern Territory. Following Release 4, however, the ABS revised the areas considered remote and sparsely populated to include very remote parts of New South Wales, Queensland, South Australia, Western Australia and the Northern Territory.
This paper uses unit record data from the Household, Income and Labour Dynamics in Australia (HILDA) Survey. The HILDA Project was initiated and is funded by the Australian Government Department of Social Services (DSS) and is managed by the Melbourne Institute of Applied Economic and Social Research (Melbourne Institute). The findings and views reported in this paper, however, are those of the author and should not be attributed to either DSS or the Melbourne Institute.
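Watson, N. and Wooden, M. (2012), “The HILDA Survey: A case study in the design and development of a successful household panel study”, Longitudinal and Life Course Studies, vol. 3, no. 3, pp. 369-381.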
Retirement status in Wave 4 is problematic. There was an oversight during preparation for Wave 4 that resulted in questions on retirement status contained in the Wave 2 Continuing Person Questionnaire not being reinstated.
Removal of this retirement module for Wave 4 should have been accompanied by the reinstatement of the original retirement questions, but this was overlooked and not rectified until Wave 5.
You can define retirement status based solely on age and labour force status, but to be consistent across waves you would need to apply the same criteria across all waves.
A household reference person is not provided in the HILDA datasets. Researchers will have different definitions they may wish to apply to define a household reference person. It may depend on their particular research topic or on how they want this definition to apply over time as circumstances within the household change (e.g., if relative income levels differ over time, if relationships change over time, or when someone moves in or out). Some variables that you might find useful in defining a household reference person are relationship in household (_hhrih), income (_tifefp and _tifefn), owner (_hsoid1 to _hsoid18, but these are only available in some years) and age (_hgage).
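As one illustration of how such a definition might be coded, the Python sketch below picks the household member with the highest income, breaking ties by age and then by person number. The rule itself is an assumption chosen for the example, not a HILDA recommendation. The field names are the variables mentioned above with the wave prefix stripped, and treating income as `tifefp` minus `tifefn` (positive minus negative component) is likewise an assumption you should check against the User Manual.

```python
def pick_reference_person(members):
    """Return one household member under a hypothetical rule:
    highest income, then oldest, then lowest person number."""
    def sort_key(person):
        # Assumed: net income = positive component minus negative component.
        income = person["tifefp"] - person["tifefn"]
        return (-income, -person["hgage"], person["hhpno"])

    return min(members, key=sort_key)


# Invented household of three people.
household = [
    {"hhpno": 1, "hgage": 34, "tifefp": 52000, "tifefn": 0},
    {"hhpno": 2, "hgage": 36, "tifefp": 61000, "tifefn": 0},
    {"hhpno": 3, "hgage": 8, "tifefp": 0, "tifefn": 0},
]
reference = pick_reference_person(household)
```

Because income and household composition change from wave to wave, a rule like this can select a different person in different waves. Whether that is acceptable depends on your research question.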
Please note that the person numbers (_hhpno) indicate the row of the Household Form on which that person is listed. The order in the first wave is simply the order in which the respondent mentions the people in the household to the interviewer. In later waves, joiners are added, leavers are removed, and people are shuffled up for the next wave.
We don’t provide a longitudinal household id as different users will have different definitions of what it means to be part of a longitudinal household. Does a birth or death change the household? What if someone moves in or out? Does it matter who they are or how they are related to the ‘core’ people in the household? If a couple divorces, who does the household belong to after the divorce? Or what happens if an adult son moves back into the family home – is it the same household or a different one? Does it matter if the adult son is 25 or 60? You would need to link households over time via the people living within them.
The best file to use to do this is the master file as it contains summary information on all people who were ever part of an enumerated household. This includes the xwaveid and, for each wave, the household id and outcome status. You would need to make some decisions about what constitutes a continuing household or a new household for your purposes.
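A rough Python sketch of linking households through their members follows. It assumes the master file has already been read into a list of records, each holding the person's `xwaveid` and a hypothetical `hhid` mapping from wave number to that wave's household id. The actual variable names in the master file differ, and the "continuing household" rule shown is only one of many possible choices.

```python
from collections import defaultdict


def household_transitions(master, wave_from, wave_to):
    """Map each household in wave_from to the wave_to households its
    members were found in, with the set of people making each move."""
    links = defaultdict(lambda: defaultdict(set))
    for person in master:
        hh_from = person["hhid"].get(wave_from)
        hh_to = person["hhid"].get(wave_to)
        # Skip people not enumerated in both waves.
        if hh_from is not None and hh_to is not None:
            links[hh_from][hh_to].add(person["xwaveid"])
    return {hh: dict(dest) for hh, dest in links.items()}


def continuing_household(destinations):
    """Hypothetical rule: the destination that keeps the most members."""
    return max(destinations, key=lambda hh: len(destinations[hh]))


# Invented records: household A1 splits into B1 and B2 between waves 1 and 2.
master = [
    {"xwaveid": "0100001", "hhid": {1: "A1", 2: "B1"}},
    {"xwaveid": "0100002", "hhid": {1: "A1", 2: "B2"}},
    {"xwaveid": "0100003", "hhid": {1: "A1", 2: "B1"}},
    {"xwaveid": "0100004", "hhid": {1: "A2"}},
]
links = household_transitions(master, 1, 2)
successor = continuing_household(links["A1"])
```

A majority rule like this does not resolve ties (for example, a couple who separate into two one-person households), so you would still need to decide how such cases are handled.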
You might also like to consider whether you actually need to frame your research question in terms of the longitudinal household concept. It may be possible to redefine it in terms of what happens to people who live in certain types of households over time. Households are not a well-defined concept over time (as researchers would have different definitions depending on their particular research question) whereas individuals are.
Some variables in the HILDA Survey data are imputed when complete responses are not available from respondents. For example, many income variables contain imputed values. The HILDA Survey team provides users with information about which values are imputed. For example, for “household financial year disposable total income” (_hifditp/_hifditn) there is an imputation flag, _hifditf. Across the first 16 waves of the HILDA Survey, about 25% of the values for this variable at the household level are imputed. This variable is the sum of many income components, and it only takes one missing value at the lower level for the overall total to be missing.
A user might be tempted to throw these observations away as they do not contain actual responses from participants. Users might be worried that their analysis is affected by using these imputed values, which are the product of a model used by the HILDA Survey team.
First, users should know that most imputation is relatively innocuous. Often, respondents will have left one item blank in one year, and it is fairly easy to work out a good guess for this item from the same respondents' other years and from other respondents.
Most importantly, however, throwing away imputed observations will create large amounts of bias in estimates relative to including the imputed values. While there may be some errors introduced by the imputation procedure, the errors introduced by excluding observations with imputed values will be much larger. This is due to “selection bias.” Observations for which values have been imputed are systematically different from those for which imputation has not been done. By excluding those observations, users risk introducing large amounts of selection bias into their estimates.
There is widespread agreement in the empirical social science literature and in the statistics literature that it is far superior to include the observations with imputed values. It is also recommended to include the imputation indicator (dummy) variable as an explanatory variable in your regression. For example, if you are estimating a model with “household financial year disposable total income”, you should include the _hifditf indicator as an additional explanatory variable. This will help to “soak up” any errors that may have been introduced by the imputation process. (See Frick and Grabka (2007), “Item non-response and Imputation of Annual Labor Income in Panel Surveys from a Cross-National Perspective”. DIW Discussion Paper 736.)
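To make the recommendation concrete, here is a small self-contained Python sketch that keeps every observation and adds the imputation flag as a dummy regressor. It is an illustration only: the data are invented, the outcome `y` is hypothetical, the flag is assumed to have been recoded to 0/1, and the hand-written least-squares solver stands in for whatever statistical package you normally use.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Invented data: income in $000s, imputation flag (assumed 0/1), outcome.
income = [40.0, 55.0, 70.0, 30.0, 90.0, 60.0]
imputed = [0, 0, 1, 1, 0, 1]
y = [22.0, 29.5, 40.0, 20.0, 47.0, 35.0]

# Design matrix: intercept, income, and the imputation dummy.
X = [[1.0, inc, float(flag)] for inc, flag in zip(income, imputed)]
intercept, b_income, b_imputed = ols(X, y)
```

The coefficient on the dummy absorbs any level difference associated with imputed records, while all observations, imputed or not, still contribute to the income coefficient.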