Advanced Techniques in Applied Economics


Spring 2026 · UPF

Graphical models, causal discovery, latent variables, interference, and positive dependence.


Individual Research Projects

This course connects modern structural statistics — graphical models, latent variable methods, causal discovery, interference, and structured dependence — with questions that arise in applied economics.

Expectations for the final project

You will have about 3–4 weeks to complete this project. The goal is not to produce a publishable paper or genuinely new empirical results. Instead, the goal is to engage seriously with one specific methodological idea from the course and show that you understand:

You may approach the project from either an applied or a theoretical perspective.

Option A: Applied / empirical / simulation track

For students interested in applied work, the goal is to take one method, apply it, and test its boundaries. A successful project will usually:

  1. Implement an estimator or algorithm in R or Python.
  2. Apply it to a real dataset or a carefully designed simulation.
  3. Evaluate how the method behaves when its assumptions are plausible, questionable, or clearly violated.

Examples:

Option B: Theory / methodology track

For students leaning toward econometrics, statistics, or theory, the project does not need to involve a real dataset. A successful project might:

  1. Clarify the logic of a method or identification argument.
  2. Compare two related approaches and explain where they differ.
  3. Work through a theoretical example, proof sketch, counterexample, or simulation that reveals the role of the assumptions.
  4. Adapt a method to a specific economic setting or explain why such an adaptation is difficult.

A project is still successful even if the method “fails,” as long as the failure is clearly explained and tied to the assumptions.


Suggested project topics

Category 1: Causal discovery and directional structure

These topics focus on what can and cannot be learned about causal direction from observational data.

1. Constraint-based causal discovery and the fragility of Gaussian assumptions

Objective: The PC algorithm is a constraint-based method built from conditional independence tests. Study how its output changes when different CI tests are used, especially in settings where Gaussian assumptions are questionable. Apply the method to simulated data or to a small macroeconomic or financial example.
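As a possible starting point, here is a minimal Python sketch of the Fisher z-test for zero partial correlation, the CI test most often paired with the PC algorithm under Gaussian assumptions. It assumes NumPy; the chain X -> Y -> Z, the coefficients, and the nonlinear variable W are illustrative choices, not part of the course material. It is not an implementation of PC itself, only of the test that PC would call.

```python
import math
import numpy as np

def partial_corr(x, y, z=None):
    """Correlation of x and y after linearly regressing out z (if given)."""
    if z is not None:
        Z = np.column_stack([np.ones(len(x)), z])
        x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
        y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(x, y)[0, 1])

def fisher_z_pvalue(r, n, n_cond):
    """Two-sided p-value of the Fisher z-test for zero (partial) correlation."""
    stat = math.atanh(r) * math.sqrt(n - n_cond - 3)
    return math.erfc(abs(stat) / math.sqrt(2))

rng = np.random.default_rng(0)
n = 2000

# Linear Gaussian chain X -> Y -> Z: X and Z are dependent, but independent given Y.
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 0.8 * y + rng.normal(size=n)
r_marginal = partial_corr(x, z)
r_given_y = partial_corr(x, z, y)
p_marginal = fisher_z_pvalue(r_marginal, n, 0)
p_given_y = fisher_z_pvalue(r_given_y, n, 1)

# Nonlinear case: W depends strongly on X, yet their linear correlation is near zero,
# so a correlation-based test has little power to detect the dependence.
w = x**2 + 0.5 * rng.normal(size=n)
r_nonlinear = partial_corr(x, w)
r_squares = partial_corr(x**2, w)

print(f"chain: r(X,Z)={r_marginal:.3f}, r(X,Z|Y)={r_given_y:.3f}, p={p_given_y:.3f}")
print(f"nonlinear: r(X,W)={r_nonlinear:.3f}, r(X^2,W)={r_squares:.3f}")
```

A project could replace `fisher_z_pvalue` with a nonparametric or kernel-based CI test and record which edges of the estimated graph change.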

2. Identifying direction via non-Gaussianity: LiNGAM

Objective: In linear Gaussian SEMs, causal direction is identified at most up to Markov equivalence, so the direction of a single edge often cannot be learned from observational data. LiNGAM shows that non-Gaussian shocks can break this symmetry. Study the logic of LiNGAM, implement a simple example, and explain clearly why non-Gaussianity helps.
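The identification idea can be illustrated in the two-variable case with a short Python sketch. This is not the LiNGAM algorithm (which is usually based on ICA or on a sequence of independence tests); it is a pairwise check, assuming NumPy, a unit slope, and a crude dependence score based on correlations of squares. All names and parameter values are hypothetical.

```python
import numpy as np

def residual_dependence(cause, effect):
    """Regress effect on cause by OLS; return |corr(residual^2, cause^2)|.

    The OLS residual is always uncorrelated with the regressor, so squares are
    used as a crude check for dependence beyond correlation.
    """
    cause = cause - cause.mean()
    effect = effect - effect.mean()
    slope = (cause @ effect) / (cause @ cause)
    resid = effect - slope * cause
    return abs(float(np.corrcoef(resid**2, cause**2)[0, 1]))

def uniform_noise(rng, n):
    # Uniform on [-sqrt(3), sqrt(3)] has variance 1 and is clearly non-Gaussian.
    return rng.uniform(-np.sqrt(3), np.sqrt(3), size=n)

def gaussian_noise(rng, n):
    return rng.normal(size=n)

def simulate(noise, n, rng):
    """True model: X -> Y with Y = X + independent noise."""
    x = noise(rng, n)
    y = x + noise(rng, n)
    return x, y

rng = np.random.default_rng(1)
n = 5000

x, y = simulate(uniform_noise, n, rng)
forward_uniform = residual_dependence(x, y)    # correct direction
backward_uniform = residual_dependence(y, x)   # wrong direction

xg, yg = simulate(gaussian_noise, n, rng)
forward_gauss = residual_dependence(xg, yg)
backward_gauss = residual_dependence(yg, xg)

print(f"uniform shocks:  forward={forward_uniform:.3f}, backward={backward_uniform:.3f}")
print(f"gaussian shocks: forward={forward_gauss:.3f}, backward={backward_gauss:.3f}")
```

With uniform shocks, only the correct direction leaves a residual that looks independent of the regressor. With Gaussian shocks both directions do, which is the symmetry that prevents identification.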

3. Causal discovery with hidden confounding: DAGs versus MAGs

Objective: Compare what can be learned when causal sufficiency is assumed and when it is not. Use DAG-based output and MAG-based output to study how hidden confounding changes the graphical summary.

4. Additive noise models as a nonlinear alternative to LiNGAM

Objective: LiNGAM uses linearity plus non-Gaussianity. Additive noise models use nonlinearity plus independence of the noise. Explain the identification logic and compare these two routes to directional discovery.


Category 2: Dynamic and network structure

These topics explore how graph-based ideas enter applied economics when units interact or when dependence is dynamic.

5. Production networks and aggregate fluctuations

Objective: Microeconomic shocks can propagate through input-output or supply-chain networks. Study how network structure amplifies or dampens shocks, and relate this to sparse dependence or transmission graphs.
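To make the amplification question concrete, the following Python sketch compares two stylized input-output networks. It assumes NumPy and a Cobb-Douglas style economy in which aggregate output is a weighted sum of sector shocks, with weights given by an influence vector v = (alpha/n) [I - (1 - alpha) W']^{-1} 1. That formula, the labor share `alpha`, and both networks are assumptions made for illustration and should be checked against whichever paper you study.

```python
import numpy as np

def influence_vector(W, alpha):
    """Influence vector for a row-stochastic input-share matrix W (assumed model)."""
    n = W.shape[0]
    system = np.eye(n) - (1 - alpha) * W.T
    return (alpha / n) * np.linalg.solve(system, np.ones(n))

n, alpha = 50, 0.5

# Balanced network: every sector buys equally from every sector.
W_balanced = np.full((n, n), 1.0 / n)

# Star network: every sector buys all of its inputs from sector 0.
W_star = np.zeros((n, n))
W_star[:, 0] = 1.0

v_balanced = influence_vector(W_balanced, alpha)
v_star = influence_vector(W_star, alpha)

# With i.i.d. unit-variance sector shocks, the standard deviation of the
# aggregate is the Euclidean norm of the influence vector.
vol_balanced = float(np.linalg.norm(v_balanced))
vol_star = float(np.linalg.norm(v_star))

print(f"aggregate volatility, balanced: {vol_balanced:.4f}")
print(f"aggregate volatility, star:     {vol_star:.4f}")
```

In the balanced network idiosyncratic shocks average out at rate 1/sqrt(n), while in the star network the shock to the central supplier does not wash out.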

6. Peer effects and interference in school or village networks

Objective: When one unit’s treatment affects another unit’s outcome, SUTVA fails. Study one simple interference design or paper and explain how direct and spillover effects are separated.
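One very simple interference design can be simulated as follows. The sketch assumes NumPy, units grouped in pairs, spillovers only within a pair, independent Bernoulli(1/2) treatment assignment, and a linear outcome model; the effect sizes `tau_direct` and `gamma_spill` are hypothetical. Under these assumptions a regression on own and partner treatment separates the two effects.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pairs = 4000
tau_direct, gamma_spill = 2.0, 1.0  # hypothetical direct and spillover effects

# Column j of d is the treatment of member j in each pair.
d = rng.integers(0, 2, size=(n_pairs, 2)).astype(float)
d_partner = d[:, ::-1]  # the other member's treatment

y = 1.0 + tau_direct * d + gamma_spill * d_partner + rng.normal(size=(n_pairs, 2))

# Stack both members and regress the outcome on own and partner treatment.
design = np.column_stack([np.ones(2 * n_pairs), d.ravel(), d_partner.ravel()])
coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
direct_hat, spill_hat = float(coef[1]), float(coef[2])

print(f"direct effect estimate:    {direct_hat:.3f}")
print(f"spillover effect estimate: {spill_hat:.3f}")
```

A project could then break the assumptions, for example by letting spillovers cross pairs or by correlating assignment within pairs, and study what the same regression estimates.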

7. Information diffusion in village networks

Objective: Study how information or technology adoption spreads over a network. Focus on exposure mappings, targeting, or the role of central nodes.

8. Network estimation for high-dimensional time series

Objective: In multivariate time series, one may want to distinguish lagged dependence from contemporaneous conditional dependence of shocks. Study one method for constructing such networks and explain what the resulting graph means.
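The distinction between the two kinds of dependence can be sketched in Python with a small VAR(1). The example assumes NumPy, Gaussian shocks, and a hypothetical three-variable system. It uses plain OLS and an unregularized inverse covariance, which is only sensible in low dimensions; a high-dimensional method would add sparsity penalties at both steps.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 3000

# Lagged structure: variable 0 feeds into variable 1 with one lag.
A = np.array([[0.5, 0.0, 0.0],
              [0.4, 0.5, 0.0],
              [0.0, 0.0, 0.5]])
# Contemporaneous structure: shocks to variables 1 and 2 are correlated.
Sigma = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.6],
                  [0.0, 0.6, 1.0]])

shocks = rng.multivariate_normal(np.zeros(3), Sigma, size=T)
X = np.zeros((T, 3))
for t in range(1, T):
    X[t] = A @ X[t - 1] + shocks[t]

# Step 1: OLS estimate of the lag matrix (edges of the lagged network).
lagged, current = X[:-1], X[1:]
B, *_ = np.linalg.lstsq(lagged, current, rcond=None)
A_hat = B.T
resid = current - lagged @ B

# Step 2: partial correlations of the residuals (edges of the contemporaneous network).
K = np.linalg.inv(np.cov(resid, rowvar=False))
scale = np.sqrt(np.diag(K))
partial_corr = -K / np.outer(scale, scale)
np.fill_diagonal(partial_corr, 1.0)

print(np.round(A_hat, 2))
print(np.round(partial_corr, 2))
```

Here the edge from variable 0 to variable 1 appears only in `A_hat`, and the edge between variables 1 and 2 appears only in `partial_corr`.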

9. Policy targeting under network interference

Objective: When spillovers are present, the best treatment rule depends on the whole network, not only on unit-level effects. Study one recent paper and explain the targeting logic carefully.

Category 3: Hidden confounding, proxies, and high-dimensional adjustment

These topics focus on situations where the confounders are not directly observed or are too numerous to handle naively.

10. Double machine learning for treatment effects

Objective: Study how DML uses nuisance estimation plus orthogonalization to estimate treatment effects with many controls. Explain why naive machine learning is not enough.
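The contrast between a naive regularized fit and orthogonalization can be sketched in Python for a partially linear model. The example assumes NumPy and uses ridge regression as a stand-in for a generic machine learning method, with a deliberately heavy penalty so that the regularization bias is visible. The data-generating process, the penalty `lam_per_obs`, and the function names are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def naive_theta(Y, D, X, lam_per_obs):
    """Regress Y on (D, X), shrinking only the control coefficients; return the D coefficient."""
    n, p = X.shape
    Z = np.column_stack([D, X])
    penalty = lam_per_obs * n * np.eye(p + 1)
    penalty[0, 0] = 0.0  # the treatment coefficient is not penalized
    return float(np.linalg.solve(Z.T @ Z + penalty, Z.T @ Y)[0])

def dml_theta(Y, D, X, lam_per_obs, n_folds=5, seed=0):
    """Cross-fitted partialling-out: residualize Y and D on X, then regress residual on residual."""
    n = len(Y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    Y_res, D_res = np.empty(n), np.empty(n)
    for k in range(n_folds):
        test, train = folds == k, folds != k
        lam = lam_per_obs * train.sum()
        Y_res[test] = Y[test] - X[test] @ ridge_fit(X[train], Y[train], lam)
        D_res[test] = D[test] - X[test] @ ridge_fit(X[train], D[train], lam)
    return float(D_res @ Y_res / (D_res @ D_res))

rng = np.random.default_rng(4)
n, p, theta = 2000, 10, 1.0
X = rng.normal(size=(n, p))
D = X @ np.full(p, 0.5) + rng.normal(size=n)           # treatment depends on controls
Y = theta * D + X @ np.full(p, 0.5) + rng.normal(size=n)

theta_naive = naive_theta(Y, D, X, lam_per_obs=0.1)
theta_dml = dml_theta(Y, D, X, lam_per_obs=0.1)

print(f"true theta = {theta}, naive = {theta_naive:.3f}, DML = {theta_dml:.3f}")
```

Both estimators use the same biased learner. In the naive fit the shrinkage error enters the treatment coefficient at first order, while in the residual-on-residual step it enters only through the product of the two nuisance errors.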

11. Proxy controls for an unobserved confounder

Objective: Study the idea that multiple noisy proxies can help recover information about a hidden confounder. Explain what is and is not identified, and illustrate with a simulation.
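A simulation of this kind might start from the following Python sketch. It assumes NumPy and a fully linear model with two proxies whose measurement errors are independent of each other and of everything else; these are strong assumptions, and the sketch covers only this special case, not general proxy identification. The variable names and the value of `theta` are hypothetical.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients with an intercept in position 0."""
    Z = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

rng = np.random.default_rng(5)
n, theta = 20000, 1.0

U = rng.normal(size=n)                      # hidden confounder
D = U + rng.normal(size=n)                  # treatment
Y = theta * D + U + rng.normal(size=n)      # outcome
W1 = U + rng.normal(size=n)                 # noisy proxy 1
W2 = U + rng.normal(size=n)                 # noisy proxy 2

theta_no_control = float(ols(Y, D)[1])
theta_one_proxy = float(ols(Y, D, W1)[1])   # controlling for a noisy proxy only reduces the bias

# Use the second proxy as an instrument for the first (just-identified IV).
R = np.column_stack([np.ones(n), D, W1])    # regressors
Z = np.column_stack([np.ones(n), D, W2])    # instruments
theta_iv = float(np.linalg.solve(Z.T @ R, Z.T @ Y)[1])

print(f"no control: {theta_no_control:.3f}")
print(f"one proxy as control: {theta_one_proxy:.3f}")
print(f"second proxy as instrument: {theta_iv:.3f}")
```

A single proxy leaves residual confounding because of its measurement error. The second proxy helps only because its error is assumed independent of the first, which is the assumption a project should stress-test.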

12. Text as a proxy for hidden institutional or policy variables

Objective: Study whether text-derived variables (topics, embeddings, sentiment, etc.) can help proxy for hidden confounders or latent institutional states in an economic application.

13. Synthetic instruments and many-instrument methods

Objective: Study the many-instrument problem and how regularization can be used to construct a useful instrument from many candidate variables. Explain the distinction between a classical instrument and a constructed instrument.

14. Latent factor adjustment for hidden confounding

Objective: When many observables share a common hidden source, a factor model or principal component adjustment may partially recover the confounder. Study the logic, assumptions, and limitations of this approach.
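The logic can be sketched in Python with a one-factor simulation. The example assumes NumPy, a single hidden confounder that loads equally on 50 observables, and a linear outcome model; all values are hypothetical. The first principal component is used as the estimated confounder.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients with an intercept in position 0."""
    Z = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

rng = np.random.default_rng(6)
n, p, theta = 5000, 50, 1.0

U = rng.normal(size=n)                       # hidden confounder
X = U[:, None] + rng.normal(size=(n, p))     # observables sharing the hidden source
D = U + rng.normal(size=n)                   # treatment
Y = theta * D + U + rng.normal(size=n)       # outcome

# First principal component of the centered observables.
Xc = X - X.mean(axis=0)
left, sing, _ = np.linalg.svd(Xc, full_matrices=False)
factor_hat = left[:, 0] * sing[0]

theta_naive = float(ols(Y, D)[1])
theta_adjusted = float(ols(Y, D, factor_hat)[1])
factor_quality = abs(float(np.corrcoef(factor_hat, U)[0, 1]))

print(f"naive: {theta_naive:.3f}, factor-adjusted: {theta_adjusted:.3f}")
print(f"|corr(estimated factor, U)| = {factor_quality:.3f}")
```

The adjustment works here because many observables load on the confounder, so the noise in the estimated factor is small. Reducing `p`, weakening the loadings, or letting the treatment load on a second factor are natural ways to probe the limitations.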


Category 4: Latent variable models and hidden structure

These topics focus on using latent variables to represent hidden heterogeneity, hidden classes, or hidden dependence structure.

15. Gaussian mixtures and hidden segmentation

Objective: Use Gaussian mixtures or latent class models to represent hidden subpopulations. A good project compares latent classes with factor models and explains when each is more appropriate.
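For the mixture side of the comparison, a minimal EM routine for a two-component univariate Gaussian mixture can be written in a few lines of Python. The sketch assumes NumPy, well-separated components, and a crude initialization at the sample minimum and maximum; the simulated population shares and means are hypothetical.

```python
import numpy as np

def em_two_gaussians(x, n_iter=100):
    """EM for a two-component univariate Gaussian mixture."""
    mu = np.array([x.min(), x.max()], dtype=float)   # crude initialization
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted updates of shares, means, and standard deviations.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma

rng = np.random.default_rng(7)
n = 2000
from_first = rng.random(n) < 0.4
x = np.where(from_first, rng.normal(-2.0, 1.0, n), rng.normal(3.0, 1.0, n))

weights, means, sds = em_two_gaussians(x)
print("shares:", np.round(weights, 3))
print("means: ", np.round(means, 3))
print("sds:   ", np.round(sds, 3))
```

The responsibilities in the E-step are the soft class memberships. A factor model would instead assign each unit a continuous score, which is the contrast the project is meant to explore.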

16. Uncovering latent worker and firm types

Objective: Study how latent-type ideas enter matched employer-employee data. This project may be more conceptual than computational if the data are too difficult to access.

17. Non-independent component analysis

Objective: Standard ICA assumes independent latent components. Study what changes when latent components are not independent, and explain how this broadens the latent-variable perspective.

18. Matrix completion and latent panel structure

Objective: Study matrix-completion or synthetic-control style methods as latent-factor approaches to causal panel data.


Category 5: Positive dependence and robust graph estimation

These topics focus on dependence structures that go beyond sparsity alone.

19. Total positivity in macro or financial comovement

Objective: Study whether a positive-dependence assumption such as $\mathrm{MTP}_2$ (multivariate total positivity of order 2) is plausible in an economic dataset, and explain what statistical benefits it gives for covariance or graph estimation.
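For Gaussian distributions the condition has a simple matrix form: the distribution is totally positive of order 2 exactly when all off-diagonal entries of the precision matrix are non-positive, that is, when all partial correlations are non-negative. The Python sketch below checks this for two hypothetical covariance matrices. It assumes NumPy and a known covariance; with estimated covariances a tolerance or a formal test would be needed.

```python
import numpy as np

def is_gaussian_mtp2(cov, tol=1e-10):
    """True if the precision matrix has no positive off-diagonal entries."""
    K = np.linalg.inv(cov)
    off_diag = K[~np.eye(len(K), dtype=bool)]
    return bool(np.all(off_diag <= tol))

# One-factor covariance with positive loadings: satisfies the condition.
loadings = np.array([0.9, 0.7, 0.5, 0.8])
cov_factor = np.outer(loadings, loadings) + np.diag([0.5, 0.6, 0.7, 0.4])

# All correlations are positive, but variables 0 and 1 are negatively
# correlated given variable 2, so the condition fails.
cov_positive_only = np.array([[1.0, 0.1, 0.5],
                              [0.1, 1.0, 0.5],
                              [0.5, 0.5, 1.0]])

factor_is_mtp2 = is_gaussian_mtp2(cov_factor)
positive_only_is_mtp2 = is_gaussian_mtp2(cov_positive_only)

print("one-factor model:", factor_is_mtp2)
print("positive correlations only:", positive_only_is_mtp2)
```

The second example shows that positive pairwise correlations alone are not enough, which is why the plausibility of the assumption has to be argued for each dataset.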

20. Robust financial networks via non-paranormal or elliptical models

Objective: Compare Gaussian graphical models with more robust alternatives for financial data, such as non-paranormal or elliptical partial correlation graphs.
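One robust alternative can be sketched in Python using a rank-based correlation estimate. The example assumes NumPy and a non-paranormal model, meaning Gaussian latent variables observed through unknown increasing transformations, and it uses the Gaussian-copula relation rho = 2 sin(pi/6 * Spearman's rho). The chain structure and the transformations are hypothetical, and no sparsity penalty is applied.

```python
import numpy as np

def ranks(x):
    """Ranks 0..n-1 of a vector without ties."""
    return np.argsort(np.argsort(x)).astype(float)

def rank_based_correlation(data):
    """Latent correlation estimate from Spearman's rho under a Gaussian copula."""
    R = np.column_stack([ranks(col) for col in data.T])
    spearman = np.corrcoef(R, rowvar=False)
    return 2.0 * np.sin(np.pi / 6.0 * spearman)

def partial_correlations(corr):
    K = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(K))
    out = -K / np.outer(scale, scale)
    np.fill_diagonal(out, 1.0)
    return out

rng = np.random.default_rng(8)
n = 5000

# Latent Gaussian chain z1 - z2 - z3: z1 and z3 are independent given z2.
z1 = rng.normal(size=n)
z2 = 0.6 * z1 + 0.8 * rng.normal(size=n)
z3 = 0.6 * z2 + 0.8 * rng.normal(size=n)
latent = np.column_stack([z1, z2, z3])

# Observed data: increasing but nonlinear transformations of the latent variables.
observed = np.column_stack([np.exp(z1), z2**3, np.exp(z3)])

corr_rank = rank_based_correlation(observed)
corr_pearson = np.corrcoef(observed, rowvar=False)
pcor_rank = partial_correlations(corr_rank)
pcor_pearson = partial_correlations(corr_pearson)

print("rank-based partial correlations:\n", np.round(pcor_rank, 2))
print("Pearson-based partial correlations:\n", np.round(pcor_pearson, 2))
```

Because ranks are unchanged by increasing transformations, the rank-based estimate is the same on `observed` and on `latent`, while the Pearson-based graph depends on the marginal distributions.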


Proposing your own topic

You are strongly encouraged to propose your own topic, especially if it connects to an ongoing thesis or research interest.

A custom topic should satisfy three criteria:

  1. Relevance: it must clearly connect to a method or idea from the course.
  2. Core component: it must include either an implementation / simulation / empirical illustration, or a serious methodological / theoretical analysis.
  3. Critical evaluation: it must go beyond description and say something about assumptions, limits, or interpretation.

To propose your own topic, e-mail me describing: