Working paper

The Elasticity of Taxable Income, Welfare Changes and Optimal Tax Rates (WP 13/24)

Abstract

This paper provides a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. It draws together, using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal taxes. Particular attention is given to the way value judgements can be specified when using this approach, and results are illustrated using the New Zealand income tax. In addition, some new results, particularly in terms of non-marginal tax changes, are presented.

This Working Paper is available in Adobe PDF and HTML.

Acknowledgements

I am very grateful to Martin Keene for many detailed comments, including pointing out an error in an earlier draft. I have also benefited from comments and suggestions by Simon Carey, Norman Gemmell, Jose Sanz-Sanz and Michael Freudenberg.

Disclaimer

The views, opinions, findings, and conclusions or recommendations expressed in this Working Paper are strictly those of the author. They do not necessarily reflect the views of the New Zealand Treasury or the New Zealand Government. The New Zealand Treasury and the New Zealand Government take no responsibility for any errors or omissions in, or for the correctness of, the information contained in these working papers. The paper is presented not as policy, but with a view to inform and stimulate wider debate.

Executive Summary

The concept of the elasticity of taxable income has become widely used in both the positive literature on the behavioural incentive effects of income taxation and in the normative literature on welfare effects and optimal taxation. This elasticity is defined as the elasticity of taxable income with respect to the net-of-tax rate (one minus the marginal tax rate), and is therefore positive. The attractions are that its use eliminates the need to construct and estimate a fully specified structural model of taxpayers' behaviour, and optimal tax rates can be readily discussed and expressed explicitly in terms of the elasticity of taxable income. These advantages nevertheless come at a cost, in terms of the difficulties of empirical estimation and the strong underlying assumptions required to generate some of the results.

The aim of the present paper is to provide a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. It draws together, using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal taxes. In addition, it presents some new results, particularly in terms of non-marginal tax changes.

In the 'standard' optimal tax literature, in the context of a tax and transfer system where labour supply responds to tax changes, a starting point is a social welfare, or evaluation, function expressed in terms of individuals' utilities. This welfare function reflects the value judgements of an independent judge, in particular regarding the judge's aversion to inequality. In the present context a social welfare function is not fully specified but a judge is assumed to take a view about the value of additional government tax-financed expenditure resulting from the extra revenue from a small tax increase. The additional expenditure is not explicitly divided into transfer and other expenditure. The independent judge also forms a view about the weight attached to the loss of welfare resulting from the small tax increase.

The main concepts required here are the social marginal valuation, SMV, which reflects the weight attached to the loss of welfare suffered by those in the relevant tax bracket as a result of a small tax increase, and the marginal value of public funds, MVPF, which is the value attributed by the judge to the extra tax-financed expenditure resulting from the small tax increase.

Examining the implications of adopting alternative value judgements involves examining the effects on optimal tax rates of alternative values of the ratio SMV / MVPF. The question therefore arises of how to interpret different orders of magnitude. The paper seeks to make assumptions more transparent.

Both the standard optimal tax approach and the use of the elasticity of taxable income involve the use of highly simplified models, both in terms of the economic environment and the behaviour of individuals. Neither approach can of course be expected to provide detailed practical policy advice. However, they can both be used, in their different ways, to illuminate and clarify different aspects of the complex relationships involved in choosing a tax rate structure.

1 Introduction

The concept of the elasticity of taxable income has, following Feldstein (1995), become widespread in both the positive literature on the behavioural incentive effects of income taxation and in the normative literature on welfare effects and optimal taxation. [1] This elasticity is defined as the elasticity of taxable income with respect to the net-of-tax rate (one minus the marginal tax rate), and is therefore positive. For example, much use was made of this elasticity, following the contribution of Saez (2001), in the report chaired by Sir James Mirrlees, which consisted of two substantial volumes (Mirrlees, 2010, 2011) produced under the aegis of the Institute for Fiscal Studies in London. The attraction is obvious: the use of a reduced-form approach eliminates the need to construct and estimate a fully specified structural model of taxpayers' behaviour, and optimal tax rates can be readily discussed and expressed explicitly in terms of the elasticity of taxable income. Such elasticities are 'bread and butter' to economists and may be regarded as being more 'concrete' than the elements which enter into the determinants of optimal tax rates in the types of structural labour supply model which followed the earlier work of Mirrlees (1971). [2] These advantages nevertheless come at a cost, in terms of the difficulties of empirical estimation and the strong underlying assumptions required to generate some of the results.

The aim of the present paper is to provide a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. It draws together, using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal taxes, in addition to presenting some new results, particularly in terms of non-marginal tax changes.

Section 2 introduces the quasi-linear utility function that is implicit in all studies which use a constant elasticity specification in which there are no income effects on taxable income of marginal tax rate changes. Section 3 discusses the revenue and welfare effects, measured in terms of the excess burden and marginal welfare cost of small increases in the marginal tax rate. For simplicity the tax function is assumed to have a single marginal rate applied above an income threshold. The results are extended in Section 4 to allow for the situation in which some income is shifted into an alternative form which faces a lower marginal tax rate. Section 5 introduces the more realistic multi-rate income tax function and shows that the results of the previous sections apply directly to the top marginal tax rate in such a structure. Section 5 then extends those results to deal with tax rates in any of the income brackets in the multi-rate function. Expressions for optimal rates are examined in section 6. The discussion emphasises the treatment of value judgements in this context.

All the results in sections 3 to 6 apply to small changes: this is appropriate when considering optimal rates, for which an equi-marginal condition applies whereby the marginal benefits of a small increase in a tax rate (arising from the extra expenditure financed by the revenue increase) must be equal to the marginal cost (in terms of the weight attached to welfare changes), as perceived by an independent judge. The marginal excess burden per dollar of revenue raised (the marginal welfare cost) clearly becomes infinitely large as the tax rate (within any income band) approaches its revenue-maximising rate, in view of the fact that the change in revenue (the denominator) becomes zero. However, the total excess burden as a proportion of the total tax raised remains finite for rates beyond this point. In some policy contexts the total efficiency loss arising from the tax is of primary interest rather than the marginal excess burden. In other contexts efficiency losses from discrete changes in tax rates are relevant. Section 7 thus extends results to deal with the welfare changes arising from non-marginal tax rate changes. Brief conclusions are in section 8.

Notes

  • [1] Saez et al. (2012) survey a vast literature on the elasticity of taxable income, and an introduction to some of the basic analytics can be found in Creedy (2010).
  • [2] However, in the case of the linear income tax, Tuomala (1985) gives some elegant results which can also be expressed in terms of easily interpreted elasticities: for extensions, including comparisons with majority-voting outcomes, see also Creedy (2008).

2 The Basic Specification

In the literature on the elasticity of taxable income, a constant-elasticity reduced-form specification is ubiquitous, yet its derivation is seldom discussed explicitly. It is therefore useful to begin by stating clearly the nature of the required assumptions concerning individuals' utility functions and budget constraints. Let c denote net income and z gross taxable income, with z0 the value of income in the absence of taxation. Consider the quasi-linear form with parameter, η:

$$U = c - \frac{z_0}{1 + 1/\eta}\left(\frac{z}{z_0}\right)^{1 + 1/\eta} \tag{1}$$

with budget constraint:

$$c = \mu + z(1 - \tau) \tag{2}$$

Here μ is virtual income [3] and τ is the marginal tax rate. This is associated with a tax function of the form, for z > a, where a is a tax-free threshold:

$$T(z) = \tau(z - a) \tag{3}$$

and T(z) = 0 for z ≤ a, so that μ = aτ. Substituting for c in the utility function and differentiating with respect to z gives:

$$\frac{dU}{dz} = (1 - \tau) - \left(\frac{z}{z_0}\right)^{1/\eta} \tag{4}$$

Setting this equal to zero and solving for z gives:

$$z = z_0(1 - \tau)^{\eta} \tag{5}$$

and the elasticity of taxable income, $\eta_{z,1-\tau} = \frac{1-\tau}{z}\frac{dz}{d(1-\tau)}$, is constant at η. It is the linear term in c, and the absence of μ from the term involving z, which ensures that income effects are zero. It must be acknowledged that the (largely untested) assumption of no income effects is made mainly for pragmatic reasons, as it considerably simplifies the analysis. [4] Income effects would mean, for example, that the behaviour of higher-rate taxpayers in a multi-rate structure changes in response to tax rate changes in lower-tax brackets.
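The constant-elasticity property can be checked numerically. The following is a minimal sketch, assuming the quasi-linear utility takes the form U = c - {z0 / (1 + 1/η)} (z / z0)^(1 + 1/η) with budget constraint c = aτ + z(1 - τ); the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def utility(c, z, z0, eta):
    """Quasi-linear utility (assumed form): linear in c, convex cost of earning z."""
    return c - (z0 / (1.0 + 1.0 / eta)) * (z / z0) ** (1.0 + 1.0 / eta)

def net_income(z, tau, a):
    """Budget constraint c = a*tau + z*(1 - tau), so virtual income is a*tau."""
    return a * tau + z * (1.0 - tau)

def taxable_income(tau, z0, eta):
    """Closed-form solution z = z0 * (1 - tau)**eta."""
    return z0 * (1.0 - tau) ** eta

def numerical_elasticity(tau, z0, eta, h=1e-6):
    """Log-derivative of z with respect to the net-of-tax rate (1 - tau)."""
    z_up = taxable_income(tau - h, z0, eta)    # net-of-tax rate raised by h
    z_down = taxable_income(tau + h, z0, eta)  # net-of-tax rate lowered by h
    d_log_z = math.log(z_up) - math.log(z_down)
    d_log_rate = math.log(1.0 - tau + h) - math.log(1.0 - tau - h)
    return d_log_z / d_log_rate

# Hypothetical illustrative values, not taken from the paper.
z0, eta, a = 100_000.0, 0.4, 20_000.0
for tau in (0.1, 0.3, 0.5):
    z_star = taxable_income(tau, z0, eta)
    u_star = utility(net_income(z_star, tau, a), z_star, z0, eta)
    # Utility at incomes one per cent either side of the closed-form solution.
    u_near = max(utility(net_income(z, tau, a), z, z0, eta)
                 for z in (0.99 * z_star, 1.01 * z_star))
    print(tau, round(z_star, 2), u_star > u_near,
          round(numerical_elasticity(tau, z0, eta), 4))
```

The threshold a affects only the level of net income, not the chosen z, which is the absence of income effects in this specification.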

 

Notes

  • [3] In a diagram of the budget constraint, with net income (consumption) on the vertical axis and gross income on the horizontal axis, virtual income is the intercept on the vertical axis. In a multi-rate tax schedule virtual income refers to the extension to the vertical axis of the particular segment under consideration.
  • [4] The Cobb-Douglas case, which is usually so convenient, produces a much more awkward expression for z: see Creedy (2010, p. 564).

 

3 Marginal Revenue and Welfare Changes

This section examines the revenue and welfare effects of small changes in the marginal tax rate for the simple linear tax function. It is shown in section 5 that these results can be directly applied to the top rate in a multi-rate structure.

3.1 Marginal Revenue Changes

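The tax paid by an individual is given, for the simple tax function discussed above and for z > a, by τ(z - a). Total revenue collected is thus:

$$R = \tau \sum_{z_i > a} (z_i - a) \tag{6}$$

Let $\bar{z}_T$ denote the arithmetic mean income of those above the threshold, and $N_T$ the number of people above the threshold. Then total revenue becomes:

$$R = N_T \tau (\bar{z}_T - a) \tag{7}$$

The effect on R of a small change in τ, denoted MR, is:

$$MR = \frac{dR}{d\tau} = \frac{\partial R}{\partial \tau} + \frac{\partial R}{\partial \bar{z}_T}\frac{d\bar{z}_T}{d\tau} \tag{8}$$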

The first term is a pure 'tax rate' effect while the second term is a 'tax base' effect of the tax rate change. These effects are also sometimes referred to as 'mechanical' and 'behavioural' effects respectively. This terminology is used, for example, by Saez et al. (2012). In Creedy and Gemmell (2013) the tax base effect is decomposed further and shown to depend on the elasticity of taxable income and the revenue elasticity. Writing (8) in elasticity form gives:

 

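$$\eta_{R,\tau} = \eta'_{R,\tau} + \eta'_{R,\bar{z}_T}\,\eta_{\bar{z}_T,\tau} \tag{9}$$

where the mechanical effect, $\eta'_{R,\tau}$, and the revenue elasticity, $\eta'_{R,\bar{z}_T}$, are both partial elasticities (marked here by a prime). The term $\eta_{\bar{z}_T,\tau}$ is the elasticity of (average) taxable income with respect to the marginal tax rate. For this tax structure, it can be seen that $\eta'_{R,\bar{z}_T} = \bar{z}_T / (\bar{z}_T - a)$ and $\eta'_{R,\tau} = 1$. In terms of the elasticity of taxable income, $\eta_{\bar{z}_T,1-\tau}$, $\eta_{R,\tau}$ becomes:

$$\eta_{R,\tau} = 1 - \frac{\bar{z}_T}{\bar{z}_T - a}\left(\frac{\tau}{1-\tau}\right)\eta_{\bar{z}_T,1-\tau} \tag{10}$$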

Differentiating (7) with respect to τ and zT gives:

 

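$$\frac{\partial R}{\partial \tau} = N_T(\bar{z}_T - a) \tag{11}$$

$$\frac{\partial R}{\partial \bar{z}_T} = N_T \tau \tag{12}$$

and:

$$\frac{d\bar{z}_T}{d\tau} = -\frac{\eta \bar{z}_T}{1-\tau} \tag{13}$$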

Hence, marginal revenue becomes:

 

$$MR = N_T(\bar{z}_T - a) - \frac{N_T \tau \eta \bar{z}_T}{1 - \tau} \tag{14}$$

Define the term α as the ratio of the average income, $\bar{z}_T$, obtained by those above the threshold, a, to the average income measured in excess of the threshold, so that:

$$\alpha = \frac{\bar{z}_T}{\bar{z}_T - a} \tag{15}$$

This is also the total income of those above the threshold divided by the total income measured in excess of the threshold. From above, it is also known that this is the same as the (partial) revenue elasticity, $\eta'_{R,\bar{z}_T}$, at $\bar{z}_T$. Hence (14) is more succinctly written as:

$$\frac{dR}{d\tau} = N_T(\bar{z}_T - a)\left\{1 - \alpha\eta\frac{\tau}{1-\tau}\right\} \tag{16}$$

The elasticity of tax revenue with respect to the tax rate, ηR,τ, is thus:

 

$$\eta_{R,\tau} = 1 - \alpha\eta\frac{\tau}{1-\tau} \tag{17}$$

The tax rate, τ*, which maximises revenue, obtained by setting dR / dτ = 0, is thus a simple function of α and the elasticity, η, whereby:

 

$$\tau^* = \frac{1}{1 + \alpha\eta} \tag{18}$$

Thus the revenue change in (14) depends on the precise form of the distribution of declared income and the income threshold above which the tax rate, τ, applies.
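The decomposition of marginal revenue into mechanical and behavioural components, and the resulting revenue-maximising rate, can be illustrated with a short calculation. This is a sketch only: it assumes the per-taxpayer mechanical effect is (z̄T - a) and the behavioural effect is -τη z̄T / (1 - τ), evaluated at a given average income, and it uses hypothetical income values chosen so that α = 1.8, together with η = 0.8 as in the illustration of Figure 1.

```python
def mechanical_effect(zbar, a):
    """Per-taxpayer tax-rate effect: income above the threshold."""
    return zbar - a

def behavioural_effect(tau, eta, zbar):
    """Per-taxpayer tax-base effect, evaluated at the current average income."""
    return -tau * eta * zbar / (1.0 - tau)

def revenue_elasticity(tau, eta, alpha):
    """Elasticity of revenue with respect to tau: 1 - alpha*eta*tau/(1 - tau)."""
    return 1.0 - alpha * eta * tau / (1.0 - tau)

def revenue_maximising_rate(eta, alpha):
    """tau* = 1 / (1 + alpha*eta)."""
    return 1.0 / (1.0 + alpha * eta)

# Hypothetical incomes: zbar / (zbar - a) = 180000 / 100000 = 1.8.
zbar, a, eta = 180_000.0, 80_000.0, 0.8
alpha = zbar / (zbar - a)
tau_star = revenue_maximising_rate(eta, alpha)

for tau in (0.2, 0.3, tau_star):
    m = mechanical_effect(zbar, a)
    b = behavioural_effect(tau, eta, zbar)
    print(round(tau, 4), round(m + b, 2), round(revenue_elasticity(tau, eta, alpha), 4))
```

At tau_star the two effects offset each other, so marginal revenue and the revenue elasticity are both zero up to rounding.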

3.2 Marginal Welfare Changes

Consider the marginal welfare change arising from a small change in the marginal tax rate, τ. Let E(τ,U) denote the expenditure function, expressed in terms of virtual income, μ, where individual subscripts have been omitted. Hence E(τ,U) is the minimum virtual income required to achieve a given level of utility, U, for a given tax rate, τ. For the equivalent variation, EV, the welfare change resulting from a change in the tax rate from τ1 to τ2, where the change in the tax rate has a dual effect of changing the 'price' and the virtual income, is defined, using subscripts to denote appropriate values of U and μ and omitting individual subscripts, as [5]:

$$EV = \left[E(\tau_2, U_2) - E(\tau_1, U_2)\right] + (\mu_1 - \mu_2) \tag{19}$$

The first term is the 'price effect' and the second term is the 'income effect' of the tax change, and E(τ2,U2) = μ2. For small changes this can be written as:

 

$$\frac{dEV}{d\tau} = \frac{\partial E(\tau, U)}{\partial \tau} - \frac{d\mu}{d\tau} \tag{20}$$

Using Shephard's Lemma (the Envelope theorem), it is known that ∂E(τ,U) / ∂τ = $z^H$, where the superscript indicates that it is the Hicksian, or compensated, 'demand'. In the present context, income effects are absent so that Marshallian and Hicksian demands are equal for each individual. Hence, ∂E(τ,U) / ∂τ = z.

Furthermore, from the budget constraint defined above, μ = aτ, and so dμ / dτ = a for all individuals above the threshold. Hence the welfare change is simply:

 

$$\frac{dEV}{d\tau} = z - a \tag{21}$$

This welfare change is equivalent to ∂R / ∂τ for each individual taxpayer, which is the tax-rate, or mechanical, effect on revenue of a change in τ. Adding these changes over all those above a gives the aggregate welfare change per taxpayer as:

 

$$EV = \bar{z}_T - a \tag{22}$$

The marginal excess burden per taxpayer, MEB = EV - MR, arising from the tax is found by dividing (16) by $N_T$ and subtracting the result from (22) to give:

$$MEB = (\bar{z}_T - a)\,\alpha\eta\frac{\tau}{1-\tau} = \frac{\eta\tau\bar{z}_T}{1-\tau} \tag{23}$$

The MEB is thus equal to the absolute value of the tax-base, or behavioural, effect on tax revenue of a rate change.[6] The total marginal welfare cost per dollar of extra revenue, MWC, is defined as the aggregate marginal excess burden divided by the change in aggregate tax revenue. This is:

 

$$MWC = \frac{N_T\,MEB}{dR/d\tau} = \frac{\eta\alpha\tau}{1 - \tau(1 + \eta\alpha)} \tag{24}$$

This expression is relevant only when the marginal tax rate is below the revenue-maximising rate given in (18), so that dR / dτ > 0. The MWC initially rises slowly as τ increases, for low values of τ. Then as τ approaches the value for which dR / dτ = 0, the MWC increases rapidly for further tax rate increases. At the tax rate for which dR / dτ = 0, no extra revenue can be obtained from a small increase in the tax rate and so the marginal welfare cost per dollar of extra revenue is clearly infinitely large. Sometimes this expression for MWC is used to compute its value for increasing values of τ, holding η and α constant. The latter obviously relies on the assumption that the ratio $\alpha = \bar{z}_T / (\bar{z}_T - a)$ remains constant; that is, it is independent of the tax structure. This property holds only for Pareto distributions. For alternative income distributions, the value of α is likely to change as $\bar{z}_T$, itself a function of τ, changes. The MWC can also be expressed in terms of the two elasticities, the elasticity of taxable income and the (partial) revenue elasticity $\eta'_{R,\bar{z}_T} = \alpha$, as follows:

$$MWC = \frac{\eta'_{R,\bar{z}_T}\,\eta_{\bar{z}_T,1-\tau}\,\tau}{1 - \tau\left(1 + \eta'_{R,\bar{z}_T}\,\eta_{\bar{z}_T,1-\tau}\right)} \tag{25}$$
Figure 1: Revenue Elasticity and Marginal Welfare Cost Variations

Illustrative examples of the variation in the revenue elasticity, ηR,τ, and the marginal welfare cost, as the tax rate increases, are shown in Figure 1, for values of ηz,1-τ = 0.8 and the ratio of the average income of those above the threshold to that average measured in excess of the threshold, α = 1.8. The top section of the figure shows how the revenue elasticity falls as the tax rate increases, with a revenue-maximising value of τ = 0.41 when ηR,τ = 0. The marginal welfare cost increases extremely rapidly as the tax rate approaches its revenue-maximising value. Lower values of both ηz,1-τ and α cause both curves to shift to the right as the revenue-maximising rate increases.
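The two series described for Figure 1 can be tabulated directly. The sketch below assumes the expressions MWC = ηατ / {1 - τ(1 + ηα)} and ηR,τ = 1 - αητ / (1 - τ), with η and α held constant at the values quoted above; the function names are illustrative only.

```python
def revenue_maximising_rate(eta, alpha):
    """Rate at which the revenue elasticity is zero."""
    return 1.0 / (1.0 + alpha * eta)

def revenue_elasticity(tau, eta, alpha):
    """Elasticity of revenue with respect to the tax rate."""
    return 1.0 - alpha * eta * tau / (1.0 - tau)

def marginal_welfare_cost(tau, eta, alpha):
    """Marginal excess burden per dollar of extra revenue.

    Defined only below the revenue-maximising rate, where dR/dtau > 0.
    """
    if not 0.0 <= tau < revenue_maximising_rate(eta, alpha):
        raise ValueError("tau must lie below the revenue-maximising rate")
    return eta * alpha * tau / (1.0 - tau * (1.0 + eta * alpha))

eta, alpha = 0.8, 1.8  # values used for Figure 1
print("revenue-maximising rate:", round(revenue_maximising_rate(eta, alpha), 4))
for tau in (0.10, 0.20, 0.30, 0.40):
    print(tau,
          round(revenue_elasticity(tau, eta, alpha), 4),
          round(marginal_welfare_cost(tau, eta, alpha), 4))
```

Under these assumptions the marginal welfare cost also equals (1 - ηR,τ) / ηR,τ, which makes clear why it grows without bound as the revenue elasticity approaches zero.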

Notes

  • [5] On welfare changes and associated concepts, see Creedy (1998a).
  • [6] This is probably the source of a misunderstanding, regarding the comment by Brewer et al. (2010, p. 61) that, 'A tax change that would have been revenue neutral in the absence of a reduction in work effort will instead produce a revenue loss. It is the size of this revenue loss that determines the [marginal] ”excess burden” of taxation'. In his review, Feldstein (2012, p. 782)) criticised this comment, interpreting the revenue change, to which Brewer et al. alluded, as the total change in revenue, rather than only the behavioural component.

4 The Effect of Income Shifting

The previous discussion has assumed that the disincentive effect of taxation involves a reduction in taxable income that is also the same as gross income. However, a proportion, s, of income that would otherwise be obtained, or reported, may be shifted into another source, where it is taxed at a lower rate, t < τ.

 

Equation 26  .

Thus, the imposition of income tax at the rate, τ, means that a proportion, s, of the income reduction, z0 - z, is taxed at the rate, t. The individual's optimisation problem is thus to maximise utility, as in (1), subject to the budget constraint whereby net income, c, is given by:

$$c = z - \tau(z - a) - st(z_0 - z) \tag{27}$$

This can be written as:

 

$$c = (a\tau - stz_0) + z\{1 - (\tau - st)\} \tag{28}$$

Virtual income thus becomes (aτ - stz0) and the tax rate becomes (τ - st). As before, substituting for c in the utility function, setting dU / dz = 0 and solving for z gives:

 

$$z = z_0\{1 - (\tau - st)\}^{\eta} \tag{29}$$

The solution for z therefore takes the constant elasticity form, as above, but with the rate, τ, replaced by the effective tax rate τ - st. [7]

The tax-rate, or mechanical, effect on revenue of a marginal increase in τ is, as before, ∂R / ∂τ = z - a, while $dz / d\tau = -z_0\eta\{1 - (\tau - st)\}^{\eta - 1}$ and ∂R / ∂z = τ - st. Hence the tax-base, or behavioural, effect of an increase in τ is given by:

$$\frac{\partial R}{\partial z}\frac{dz}{d\tau} = -\frac{(\tau - st)\,\eta z}{1 - (\tau - st)} \tag{30}$$

This is, as shown above, the same as the excess burden, so that the marginal welfare cost of a small increase in τ, following the same steps as before, becomes:

 

$$MWC = \frac{\eta\alpha(\tau - st)}{1 - (\tau - st)(1 + \eta\alpha)} \tag{31}$$

This is clearly the same as the earlier result for s = 0, but with τ replaced by τ - st. [8]
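The effect of income shifting on the marginal welfare cost can be illustrated by replacing τ with the effective rate τ - st in the earlier expressions. The following is a minimal sketch under that assumption, treating s, t, η and α as given constants; the numerical values are hypothetical.

```python
def effective_rate(tau, s, t):
    """Effective marginal rate when a share s of lost income is taxed at t."""
    return tau - s * t

def taxable_income_with_shifting(tau, s, t, z0, eta):
    """z = z0 * {1 - (tau - s*t)}**eta."""
    return z0 * (1.0 - effective_rate(tau, s, t)) ** eta

def mwc_with_shifting(tau, s, t, eta, alpha):
    """MWC with tau replaced by the effective rate tau - s*t."""
    e = effective_rate(tau, s, t)
    denominator = 1.0 - e * (1.0 + eta * alpha)
    if denominator <= 0.0:
        raise ValueError("effective rate is at or above the revenue-maximising rate")
    return eta * alpha * e / denominator

# Hypothetical illustrative values.
eta, alpha, z0 = 0.8, 1.8, 100_000.0
tau, t = 0.30, 0.20
for s in (0.0, 0.25, 0.5):
    print(s,
          round(taxable_income_with_shifting(tau, s, t, z0, eta), 2),
          round(mwc_with_shifting(tau, s, t, eta, alpha), 4))
```

With s = 0 the calculation reproduces the case without shifting; a larger s lowers the effective rate and hence the marginal welfare cost at a given τ.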

Notes

  • [7] This clearly raises problems for the estimation of η, since s cannot be observed. Estimation is beyond the scope of the present paper.
  • [8] Saez et al. (2012, p.11) give an incorrect expression, by not recognising that in this case τ must be replaced by τ - st in the solution for z. An incorrect form is also given in Creedy (2010, p. 572), which also contains a printing error, and Claus et al. (2012, p. 301), who follow Saez et al.

5 A Multi-Rate Tax Structure

The previous sections have considered the case of a tax structure having a single rate applied to income measured above a tax-free threshold. The present section extends the results to the more realistic multi-rate structure that is widely used in practice. Subsection 5.1 describes the tax structure and shows that the results in previous sections can be interpreted as simply applying to the top rate in a multi-rate structure. Subsection 5.2 considers marginal revenue and welfare changes for intermediate rates.

5.1 The Tax Function

Consider the multi-step tax function, which is defined by a set of income thresholds, ak, for k = 1,...,K, and marginal income tax rates, τk, applying in tax brackets, that is between adjacent thresholds ak and ak+1. The function can be written as:

 

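$$T(z) = \begin{cases} 0 & 0 < z \le a_1 \\ \tau_1(z - a_1) & a_1 < z \le a_2 \\ \tau_1(a_2 - a_1) + \tau_2(z - a_2) & a_2 < z \le a_3 \end{cases} \tag{32}$$

and so on. If z falls into the kth tax bracket, so that $a_k < z \le a_{k+1}$, T(z) can be written for k ≥ 2 as:

$$T(z) = \tau_k(z - a_k) + \sum_{j=1}^{k-1}\tau_j(a_{j+1} - a_j) \tag{33}$$

Letting $b_k = \sum_{j=1}^{k-1}\tau_j(a_{j+1} - a_j)$, this becomes $T(z) = \tau_k(z - a_k) + b_k$. [9] Hence for an individual whose income falls into the kth tax bracket, the budget constraint in (2) becomes: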

$$c = (a_k\tau_k - b_k) + z(1 - \tau_k) \tag{34}$$

and the virtual income, μ, is simply reduced by the term bk. This means that all the above results can be applied directly to the top rate in a multi-rate structure. Importantly, references to tax revenue must all refer to revenue collected at the top marginal rate only. The assumption that the top tail of the distribution can be approximated by the Pareto distribution is clearly more reasonable in this context. The above results can easily be extended to the case of any tax rate in a multi-rate structure, as follows.
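The equivalence between the step function and its single-segment form can be verified numerically. The sketch below uses hypothetical thresholds and rates (they are not the New Zealand schedule) and assumes b_k is the tax paid on income in the brackets below the kth. It also computes an effective single threshold, obtained here by rearranging τk(z - ak) + bk as τk(z - a*k); that rearrangement is a derivation made for this illustration.

```python
def tax_by_brackets(z, thresholds, rates):
    """T(z) accumulated bracket by bracket for the multi-step function."""
    tax = 0.0
    for k, (a_k, tau_k) in enumerate(zip(thresholds, rates)):
        upper = thresholds[k + 1] if k + 1 < len(thresholds) else float("inf")
        if z > a_k:
            tax += tau_k * (min(z, upper) - a_k)
    return tax

def bracket_index(z, thresholds):
    """0-based index of the bracket containing z, or None if z <= a_1."""
    k = None
    for i, a_i in enumerate(thresholds):
        if z > a_i:
            k = i
    return k

def tax_single_segment(z, thresholds, rates):
    """T(z) = tau_k*(z - a_k) + b_k, with b_k the tax on lower brackets."""
    k = bracket_index(z, thresholds)
    if k is None:
        return 0.0
    b_k = sum(rates[j] * (thresholds[j + 1] - thresholds[j]) for j in range(k))
    return rates[k] * (z - thresholds[k]) + b_k

def effective_threshold(k, thresholds, rates):
    """a*_k = sum over j <= k of a_j*(tau_j - tau_{j-1}) / tau_k, with tau_0 = 0."""
    total, previous_rate = 0.0, 0.0
    for j in range(k + 1):
        total += thresholds[j] * (rates[j] - previous_rate)
        previous_rate = rates[j]
    return total / rates[k]

# Hypothetical three-rate schedule for illustration only.
thresholds = [10_000.0, 40_000.0, 80_000.0]
rates = [0.15, 0.30, 0.40]

for z in (5_000.0, 25_000.0, 60_000.0, 100_000.0):
    k = bracket_index(z, thresholds)
    via_threshold = 0.0 if k is None else rates[k] * (z - effective_threshold(k, thresholds, rates))
    print(z, tax_by_brackets(z, thresholds, rates),
          tax_single_segment(z, thresholds, rates), via_threshold)
```

The three calculations agree for each income, which is the sense in which a taxpayer in the kth bracket faces a single rate applied above a single (individual-specific) threshold.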

Notes

  • [9] This expression for T(z) can be rewritten as $T(z) = \tau_k(z - a_k^*)$, where $a_k^* = \frac{1}{\tau_k}\sum_{j=1}^{k} a_j(\tau_j - \tau_{j-1})$ and τ0 = 0. Thus the tax function facing any individual taxpayer in the kth bracket is equivalent to a tax function with a single marginal tax rate, τk, applied to income measured in excess of a single threshold, $a_k^*$. Therefore, unlike $a_j$, $a_k^*$ differs across individuals depending on the marginal income tax bracket into which they fall.

5.2 Changes in Intermediate Tax Rates

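In order to consider changes in lower tax rates, rather than the top rate, it is sufficient here to consider a two-rate structure, where the rate $\tau_L$ applies to incomes between the income thresholds $a_L$ and $a_H$. Let $N_L$ denote the number of people in the first tax bracket and $N_H$ the number in the top bracket. [10] Let $R_{\tau_L}$ denote the total tax revenue raised at the rate $\tau_L$, that is only from income that falls into the lower bracket, for which $a_L < z \le a_H$. The higher-rate payers must pay $\tau_L$ on an amount, $a_H - a_L$, of their income, so that:

$$R_{\tau_L} = \tau_L\left\{N_L(\bar{z}_L - a_L) + N_H(a_H - a_L)\right\} \tag{35}$$

where $\bar{z}_L$ is the arithmetic mean income of those who fall into the tax bracket with the marginal rate of $\tau_L$. The corresponding marginal revenue, using $d\bar{z}_L / d\tau_L = -\eta\bar{z}_L / (1 - \tau_L)$, is:

$$\frac{dR_{\tau_L}}{d\tau_L} = N_L(\bar{z}_L - a_L)\left\{1 - \alpha_L\eta\frac{\tau_L}{1 - \tau_L}\right\} + N_H(a_H - a_L) \tag{36}$$

where in this case:

$$\alpha_L = \frac{\bar{z}_L}{\bar{z}_L - a_L} \tag{37}$$

From earlier results the aggregate marginal excess burden is:

$$N_L(\bar{z}_L - a_L)\,\alpha_L\eta\frac{\tau_L}{1 - \tau_L} = \frac{N_L\eta\tau_L\bar{z}_L}{1 - \tau_L} \tag{38}$$

and the marginal welfare cost is thus found to be:

$$MWC = \frac{\eta\alpha_L\tau_L}{1 - \tau_L(1 + \eta\alpha_L) + D} \tag{39}$$

where:

$$D = (1 - \tau_L)\frac{N_H(a_H - a_L)}{N_L(\bar{z}_L - a_L)} \tag{40}$$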

The expression for the marginal welfare cost of raising the lower tax rate is thus the same as for the top tax rate, with the addition of the term D in the denominator.

The rate that maximises revenue from the rate τL is given by:

 

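$$\tau_L^* = \left[1 + \eta\left\{\frac{N_L\bar{z}_L}{N_L(\bar{z}_L - a_L) + N_H(a_H - a_L)}\right\}\right]^{-1} \tag{41}$$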

which clearly reduces to $\tau^* = (1 + \eta\alpha)^{-1}$ for the top tax rate, as obtained above in (18). Furthermore, the interpretation of the term in curly brackets in (41) is the same as that of α for the earlier result: it is the ratio of total income of those who fall into the relevant bracket to the total income that is taxed at the relevant marginal rate.
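The results for a lower rate can be illustrated in the two-rate case. The sketch below assumes that marginal revenue from τL consists of a mechanical effect, NL(z̄L - aL) + NH(aH - aL), and a behavioural effect, -NL τL η z̄L / (1 - τL), with average income in the bracket held at its current value; the population sizes and incomes are hypothetical.

```python
def lower_rate_mechanical(N_L, N_H, zbar_L, a_L, a_H):
    """Income taxed at tau_L: own-bracket income plus the band paid by higher-rate payers."""
    return N_L * (zbar_L - a_L) + N_H * (a_H - a_L)

def lower_rate_excess_burden(tau_L, eta, N_L, zbar_L):
    """Aggregate marginal excess burden: the absolute behavioural effect."""
    return N_L * tau_L * eta * zbar_L / (1.0 - tau_L)

def lower_rate_marginal_revenue(tau_L, eta, N_L, N_H, zbar_L, a_L, a_H):
    return (lower_rate_mechanical(N_L, N_H, zbar_L, a_L, a_H)
            - lower_rate_excess_burden(tau_L, eta, N_L, zbar_L))

def lower_rate_mwc(tau_L, eta, N_L, N_H, zbar_L, a_L, a_H):
    """MWC written with the extra denominator term D."""
    alpha_L = zbar_L / (zbar_L - a_L)
    D = (1.0 - tau_L) * N_H * (a_H - a_L) / (N_L * (zbar_L - a_L))
    return eta * alpha_L * tau_L / (1.0 - tau_L * (1.0 + eta * alpha_L) + D)

def lower_rate_revenue_max(eta, N_L, N_H, zbar_L, a_L, a_H):
    """Rate maximising revenue collected at tau_L."""
    ratio = N_L * zbar_L / lower_rate_mechanical(N_L, N_H, zbar_L, a_L, a_H)
    return 1.0 / (1.0 + eta * ratio)

# Hypothetical illustrative values.
params = dict(eta=0.4, N_L=1000, N_H=200, zbar_L=50_000.0, a_L=30_000.0, a_H=70_000.0)
tau_star_L = lower_rate_revenue_max(**params)
print("revenue-maximising lower rate:", round(tau_star_L, 4))
print("marginal revenue there:", round(lower_rate_marginal_revenue(tau_star_L, **params), 6))
print("MWC at tau_L = 0.3:", round(lower_rate_mwc(0.3, **params), 4))
```

Setting N_H to zero removes the additional mechanical revenue and returns the single-rate expressions, which is the sense in which the top-rate results are a special case.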

Notes

  • [10] In general, of course, NH can refer to all those in higher-rate brackets than the one being considered.

6 Optimal Tax Rates

The use of a reduced-form expression for taxable income in terms of the marginal tax rate means that it is also possible to express optimal tax rates in terms of the elasticity of taxable income, using the above results. In the 'standard' optimal tax literature stemming from Mirrlees (1971), in the context of a tax and transfer system where labour supply is endogenous, a starting point is a social welfare, or evaluation, function expressed in terms of individuals' utilities. This welfare function reflects the value judgements of an independent judge, in particular regarding the judge's aversion to inequality. The judge selects the tax rate (for example, the single rate in a linear tax function) to maximise the welfare function, while individuals select their labour supply to maximise utility. The value of a transfer payment is determined by the need to satisfy a government budget constraint. This budget constraint may involve a requirement to raise a given amount of non-transfer expenditure per person (rather than considering a 'pure' transfer system), but the optimal tax models usually consider this as involving a 'black hole', in that the benefits of the resulting expenditure do not enter either individuals' utility functions or the welfare function of the judge. It is well known that in general numerical simulation methods must be used to obtain results. [11] However, for this structural model, the value judgements of the judge, the nature of the tax and transfer system, and the government's budget constraint are entirely transparent.

In the present context a social welfare function is not fully specified but a judge is assumed to take a view about the value of additional government tax-financed expenditure resulting from the extra revenue from a small tax increase. [12] The additional expenditure is not explicitly divided into transfer and other expenditure. Given that a reduced-form model of individual behaviour is used, neither component of this expenditure is considered to enter the utility functions. The independent judge also forms a view about the weight attached to the loss of welfare resulting from the small tax increase. The loss of welfare is expressed as in previous sections above.

6.1 First-order Conditions in a Multi-rate Structure

The approach involves considering each tax bracket in turn; hence decisions regarding income thresholds are supposed already to have been made. [13] The value judgements of the judge are reflected in two terms. The social marginal valuation, SMV, reflects the weight attached to the loss of welfare suffered by those in the relevant tax bracket as a result of a small tax increase. The marginal value of public funds, MVPF, is the value attributed by the judge to the extra tax-financed expenditure resulting from the small tax increase. The optimal tax rate in the bracket is that rate for which (in the view of the judge) the marginal benefit of a further tax increase just matches the marginal cost. Hence the first-order condition for each tax bracket is:

$$SMV \cdot EV = MVPF \cdot MR \tag{42}$$

The left-hand side of (42) is the marginal cost, while the right-hand side is the marginal benefit of the tax increase. The previous sections have expressed the efficiency cost of a marginal tax increase in terms of the marginal excess burden per dollar of extra revenue, the MWC. Thus it is useful to convert this ‘equi-marginal’ condition into one that involves the MWC. First, rewrite (42) as:

 

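$$\frac{EV}{MR} = \frac{MVPF}{SMV} \tag{43}$$

and since by definition:

$$MWC = \frac{EV - MR}{MR} = \frac{EV}{MR} - 1 \tag{44}$$

this first-order condition becomes:

$$MWC = \frac{MVPF}{SMV} - 1 \tag{45}$$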

Public tax-financed projects may be subject to decreasing marginal valuation by the judge, and the valuation, SMV, may well depend on the tax bracket being considered.

For example, consider the simplest case above, where income is not shifted to lower-taxed sources (so that s = 0) and the rate being examined is the top rate in a multi-rate structure. Let g denote the reciprocal of $MVPF / SMV$. The term g therefore represents the weight attached (by a judge) to the welfare loss divided by the weight attached to the extra expenditure financed by the tax change. Substituting for $MWC = \eta\alpha\tau / \{1 - \tau(1 + \eta\alpha)\}$ from (24) and re-arranging (45) gives the optimal rate as:

$$\tau = \frac{1 - g}{1 - g + \eta\alpha} \tag{46}$$

Furthermore, substituting for α gives the alternative expression:

 

$$\tau = \frac{1 - g}{1 - g + \eta\left\{\dfrac{N_T\bar{z}_T}{N_T(\bar{z}_T - a)}\right\}} \tag{47}$$

The term $N_T\bar{z}_T / \{N_T(\bar{z}_T - a)\}$ measures the ratio of the total income of those in the top tax bracket to the total income that is subject to the top tax rate. In the extreme case where the judge does not care about top-rate taxpayers, g = 0 and the optimal rate is the same as the rate which maximises revenue from those taxpayers. However, this is a closed-form solution only in the case where g is considered to be constant (that is, independent of the tax rate), otherwise the precise form of g(τ) must be known.

 

Consider the optimal value for a lower marginal tax rate, τL, in the two-rate structure considered earlier (and which is easily extended to the multi-rate form). From above, this must satisfy:

 

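$$\frac{N_L\eta\tau_L\bar{z}_L / (1 - \tau_L)}{N_L(\bar{z}_L - a_L) + N_H(a_H - a_L) - N_L\eta\tau_L\bar{z}_L / (1 - \tau_L)} = \frac{1}{g} - 1 \tag{48}$$

which can be solved to give:

$$\frac{\tau_L}{1 - \tau_L} = \frac{1 - g}{\eta}\left\{\frac{N_L(\bar{z}_L - a_L) + N_H(a_H - a_L)}{N_L\bar{z}_L}\right\} \tag{49}$$

Another way to express this is:

$$\tau_L = \frac{1 - g}{1 - g + \eta\left\{\dfrac{N_L\bar{z}_L}{N_H(a_H - a_L) + N_L(\bar{z}_L - a_L)}\right\}} \tag{50}$$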

The term $N_L\bar{z}_L$ is the income of those in the relevant tax bracket, while the term $N_H(a_H - a_L) + N_L(\bar{z}_L - a_L)$ measures the income to which the rate $\tau_L$ is applied. Hence the expression for the optimal rate corresponds precisely with that given in (47) for the optimal top marginal rate.
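The dependence of the optimal rates on the value judgement g can be explored with a short calculation. This is a sketch under the assumption that g = SMV / MVPF is constant and less than one, that the optimal top rate is (1 - g) / (1 - g + ηα), and that the lower rate has the same form with α replaced by the ratio of bracket income to income taxed at that rate; all numerical values are hypothetical except η = 0.8 and α = 1.8, which follow the earlier illustration.

```python
def optimal_top_rate(g, eta, alpha):
    """Optimal top marginal rate for a constant g = SMV / MVPF, with g < 1."""
    return (1.0 - g) / (1.0 - g + eta * alpha)

def optimal_lower_rate(g, eta, N_L, N_H, zbar_L, a_L, a_H):
    """Same form, with alpha replaced by bracket income over income taxed at tau_L."""
    ratio = N_L * zbar_L / (N_H * (a_H - a_L) + N_L * (zbar_L - a_L))
    return (1.0 - g) / (1.0 - g + eta * ratio)

def marginal_welfare_cost(tau, eta, alpha):
    """MWC for the top rate, valid below the revenue-maximising rate."""
    return eta * alpha * tau / (1.0 - tau * (1.0 + eta * alpha))

eta, alpha = 0.8, 1.8
for g in (0.0, 0.25, 0.5, 0.75):
    tau = optimal_top_rate(g, eta, alpha)
    # For g > 0 the first-order condition requires MWC = 1/g - 1 at the optimum.
    target = None if g == 0.0 else 1.0 / g - 1.0
    mwc = None if g == 0.0 else marginal_welfare_cost(tau, eta, alpha)
    print(g, round(tau, 4), target, None if mwc is None else round(mwc, 4))
```

With g = 0 the rate coincides with the revenue-maximising rate, and larger values of g (a greater weight on the welfare loss of those in the bracket) lower the optimal rate.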

Notes

  • [11] For references to special cases where explicit solutions are available, and an approximation in the case of the linear income tax, see Creedy (2009).
  • [12] Perhaps understandably, the report in Mirrlees (2011) often conflates the two approaches, suggesting that the use of reduced-form elasticities, allowing income adjustment in addition to labour supply incentive effects, is in the Mirrlees tradition. The common ground is of course a concept of an optimum, based on value judgements, an allowance for incentive effects, and the ability to express the optimum in terms of an equi-marginal condition.
  • [13] A more general approach in which the tax rate can vary continuously over the whole income range is discussed in Saez (2001) and in Brewer et al. (2010). The present approach is adopted for simplicity.

6.2 Comparison with Earlier Results

Instead of writing the optimal condition in terms of the marginal welfare cost, Saez (2001) expressed the condition for the optimal rate using the decomposition of marginal revenue into mechanical and behavioural terms, M and B respectively (where of course B is negative). Write (8) as MR = M + B and from (22), it is known that EV = M. Hence rearranging (42) as Mg = M + B gives the first-order condition as M(1 - g) + B = 0, which is the form given in Saez (2001, p. 210), who does not state (42) explicitly. For the revenue-maximising rate, MR = 0 and M = -B: Brewer et al. (2010, p. 102) thus refer to this rate as 'balancing mechanical and behavioural effects'.

When discussing optimal rates Brewer et al. (2010) write the condition, using current notation, as M + B - gM = 0. In their discussion, the value of MVPF is implicitly set at 1. [14] Hence the term gM is effectively (M)(SMV) and as EV = M this is the change in 'social welfare' resulting from a small tax rate change (that is, the left hand side of the optimal rate condition in (42)). In their own notation, Brewer et al. write mechanical and behavioural effects on revenue as dM and dB respectively, and they write the social welfare change, -gM, as dW. Thus their condition is written as dM + dB + dW = 0. In their discussion of appropriate settings for the value of g, they therefore consider only the variation in the welfare loss.
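The condition M(1 - g) + B = 0 can be solved numerically and compared with the closed-form rate. The sketch below assumes per-taxpayer effects M = z̄T - a and B = -τη z̄T / (1 - τ), holds average income (and hence α) fixed while searching over τ, and uses a simple bisection; the income values are hypothetical.

```python
def first_order_condition(tau, g, eta, zbar, a):
    """M*(1 - g) + B, with M mechanical and B behavioural (B is negative)."""
    mechanical = zbar - a
    behavioural = -tau * eta * zbar / (1.0 - tau)
    return mechanical * (1.0 - g) + behavioural

def solve_optimal_rate(g, eta, zbar, a, tol=1e-12):
    """Bisection on tau in (0, 1); the condition is decreasing in tau for g < 1."""
    lo, hi = 0.0, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_order_condition(mid, g, eta, zbar, a) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def closed_form_rate(g, eta, alpha):
    """(1 - g) / (1 - g + eta*alpha)."""
    return (1.0 - g) / (1.0 - g + eta * alpha)

# Hypothetical incomes giving alpha = 1.8; eta as in the earlier illustration.
zbar, a, eta = 180_000.0, 80_000.0, 0.8
alpha = zbar / (zbar - a)
for g in (0.0, 0.3, 0.6):
    print(g, round(solve_optimal_rate(g, eta, zbar, a), 6),
          round(closed_form_rate(g, eta, alpha), 6))
```

With g = 0 the condition reduces to M = -B, the balancing of mechanical and behavioural effects at the revenue-maximising rate.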

Notes

  • [14] An allusion to this is later made in Brewer et al. (2010, p. 166, n. 75).

6.3 Imposing Value Judgements

The role of professional economists, following the famous statement by Robbins (1935), is to examine the implications of adopting alternative value judgements. In the present context this means examining the effects on optimal tax rates of alternative values of the ratio g = SMV / MVPF. The question therefore arises of how to interpret different orders of magnitude.

In the branch of optimal tax literature that follows the structural modelling approach of Mirrlees (1971), it is usual to consider the independent judge as selecting tax rates which maximise the value of a particular social welfare function, expressed in terms of individuals' utilities. The objective is thus entirely transparent and it is clear that interpersonal comparisons of utility are explicitly being made by the judge. Although this allows for a range of types of welfare function, the most common form to be examined is the additive, individualistic, Paretean, and Utilitarian form with constant relative inequality aversion, ε, so that \( W = \frac{1}{1-\varepsilon}\sum_{i} U_i^{1-\varepsilon} \).[15] The exercise then becomes one of examining the effects of using different values of ε, and in interpreting orders of magnitude it is useful to consider the well-known 'leaky bucket' experiment. [16] Within this framework, the optimal tax rate depends not only on the form of the social welfare function but also on the cardinalisation used for individuals' utility functions, although this is usually given less attention. [17]

The general structural approach can also deal with 'non-welfarist' social welfare functions, in which the judge does not evaluate outcomes in terms of things that matter directly to the individuals involved (such as their utility) but in terms of, for example, some aggregate poverty measure, or the number of non-workers.

In the present context of using the elasticity of taxable income in a reduced-form model, the choice of alternative values of g is less straightforward. Little guidance is given by Saez (2001), Brewer et al. (2010) and Mirrlees (2011). As mentioned earlier, Brewer et al. (2010) set the value of MVPF equal to one, and concentrate on discussing values of SMV. [18] The Mirrlees (2011) report gives most emphasis to the revenue-maximising rate in the top tax bracket, which (2011, p. 65) is 'equivalent to placing a zero value on their (marginal) welfare'. [19] Different judges may be concerned more explicitly with the question of how the tax revenue is spent: in the structural approach there is an explicit transfer payment and some non-transfer expenditure, the amount of which is considered to have previously been determined and which does not enter individuals' utilities. [20]

One approach might be to suppose that the judge envisages a tax and transfer system and applies an evaluation function of the form \( W = \frac{1}{1-\varepsilon}\sum_{k=1}^{K} \bar{z}_k^{1-\varepsilon} \), that is a weighted sum over the K tax brackets of average taxable income in each bracket. Consider the tax rate in the kth bracket, where transfers are assumed to go to those in the 1st bracket (and with no income effects there are no consequences for the behaviour of those in the 1st bracket). The (absolute) slope of the 'social indifference curve' relating \( \bar{z}_k \) and \( \bar{z}_1 \) values for which social welfare is unchanged is thus \( (\bar{z}_1 / \bar{z}_k)^{\varepsilon} \). Taking this slope as the value of g, if the tax rate is being considered in a bracket for which the average income is twice that in the lowest tax bracket, and ε = 1, then g = 0.5. A lower value of ε = 0.5 gives g = 0.71 while a higher value of ε = 2 gives g = 0.25. Of course, a difficulty here is that the incomes are themselves endogenous and the link with utility is not straightforward.
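The orders of magnitude quoted above can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes that g is read directly as the slope of the social indifference curve, with the tolerated leak from the 'leaky bucket' experiment being one minus that slope.

```python
def welfare_weight(z_low, z_k, epsilon):
    """Value of g implied by the slope (z_low / z_k) ** epsilon (assumed reading)."""
    return (z_low / z_k) ** epsilon


def tolerated_leak(z_poor, z_rich, epsilon):
    """Share of $1 taken from the richer person that the judge would let leak away."""
    return 1.0 - welfare_weight(z_poor, z_rich, epsilon)


# Bracket whose average income is twice that of the lowest bracket.
examples = {eps: round(welfare_weight(1.0, 2.0, eps), 2) for eps in (0.5, 1.0, 2.0)}

# Top versus bottom bracket averages from Table 1 (thousands of dollars), epsilon = 2.
top_bracket_g = welfare_weight(6.948, 115.419, 2.0)

if __name__ == "__main__":
    print(examples)
    print(round(top_bracket_g, 4), round(tolerated_leak(6.948, 115.419, 2.0), 4))
```

With ε = 2 and the Table 1 averages for the lowest and highest brackets, the implied g is well below one cent in the dollar, which is the sense in which such a value represents very high inequality aversion.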

 

Table 1: Examples of Optimal Rates: New Zealand Thresholds and Income Distribution 2010

aL     NL         zL         η      g = SMV / MVPF    τopt
0      807.04     6.948      0.2    0.996             0.126
14     1687.21    27.079     0.5    0.880             0.207
48     462.37     57.546     0.5    0.455             0.331
70     347.84     115.419    0.6    0.075             0.378

To illustrate the use of the results, Table 1 is based on the New Zealand taxable income distribution for 2010, with the 2009-2010 income thresholds. As the exercise is purely illustrative, only income taxation is considered and hence no account of taxable welfare benefits or indirect taxes, and their effect on overall effective tax rates, is taken. The first column gives the lower income threshold (in thousands of dollars) for each tax band, while columns headed NL and zL show, again in units of thousands, the number of individuals in each bracket and the arithmetic mean income respectively. The column headed η gives assumed values of the elasticity of taxable income within each bracket: these are hypothetical but are based on the results of Claus et al. (2012). Imposed values of g = SMV / MVPF and the implied optimal rates are shown in the final two columns of the table. The final column may be compared with the actual New Zealand rates of, respectively: 0.125; 0.21; 0.33 and 0.38. The values of g were obtained in each case by a trial-and-error search such that the optimal rate matches the actual rate very closely.

Figure 2: Variation in g with z

It can be seen that the values of g, required for the optimal rate schedule to replicate the actual tax rates, fall rapidly. These are plotted in Figure 2. The top marginal rate of 0.38 could be said to be consistent with the value judgement that places very little value on the marginal welfare loss of those in the top bracket: indeed, the revenue-maximising rate in this bracket is 0.396. [21] For the other tax brackets, the rates that maximise revenue are (from brackets one to three respectively) equal to 0.973, 0.685 and 0.476. These are in each case substantially higher than the actual rates. The revenue-maximising tax rate for the lowest tax bracket is of course very high because most people's incomes do not fall into that bracket, so there is very little effect on taxable income (in view of the assumed absence of income effects, whereby only the marginal rate matters).
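The optimal and revenue-maximising rates reported for Table 1 can be reproduced to three decimal places with the following sketch. The formula is an assumption: it is inferred from the first-order condition M(1 - g) + B = 0 and from the reported numbers, not quoted from the paper. It takes the mechanical effect of a rate rise in a bracket to include the extra tax paid over the full width of that bracket by everyone in higher brackets, while the behavioural effect comes only from those whose income falls within the bracket. All function names are hypothetical.

```python
# Table 1 inputs per bracket: lower threshold a (thousands of dollars), number of
# individuals N (thousands), mean income z (thousands of dollars), elasticity eta,
# and the imposed value of g.
BRACKETS = [
    (0.0, 807.04, 6.948, 0.2, 0.996),
    (14.0, 1687.21, 27.079, 0.5, 0.880),
    (48.0, 462.37, 57.546, 0.5, 0.455),
    (70.0, 347.84, 115.419, 0.6, 0.075),
]


def mechanical_to_behavioural_ratio(k):
    """M / (N z eta) for bracket k, under the assumed multi-bracket form of M."""
    a, n, z, eta, _ = BRACKETS[k]
    mechanical = n * (z - a)  # people whose income lies within the bracket
    if k + 1 < len(BRACKETS):
        width = BRACKETS[k + 1][0] - a
        n_above = sum(b[1] for b in BRACKETS[k + 1:])
        mechanical += n_above * width  # infra-marginal revenue from higher brackets
    return mechanical / (n * z * eta)


def optimal_rate(k, g=None):
    """Solve tau / (1 - tau) = (1 - g) * M / (N z eta) for tau."""
    if g is None:
        g = BRACKETS[k][4]
    x = (1.0 - g) * mechanical_to_behavioural_ratio(k)
    return x / (1.0 + x)


def revenue_maximising_rate(k):
    """Special case g = 0, where mechanical and behavioural effects balance."""
    return optimal_rate(k, g=0.0)


if __name__ == "__main__":
    for k in range(len(BRACKETS)):
        print(k + 1, round(optimal_rate(k), 3), round(revenue_maximising_rate(k), 3))
```

Setting g to 0.0036 in the top bracket, for example via `optimal_rate(3, g=0.0036)`, gives a rate close to the revenue-maximising one, since almost no weight is then placed on the marginal welfare loss.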

Notes

  • [15] The 'classical utilitarian' form - which was in fact the one considered in Mirrlees's original paper - is of course simply the sum of individuals' utilities (that is, inequality aversion is zero).
  • [16] This involves considering taking $1 from one person and deciding how much one is prepared to lose (from the leaky bucket) in making a transfer to a poorer person. With this welfare function, and incomes of the rich and poor individuals as z_R and z_P respectively, a judge would tolerate a leak of \( 1 - (z_P / z_R)^{\varepsilon} \) from the initial $1 taken from the richer person.
  • [17] See Creedy (1998b), where the use of money metric utility is explored; this is of course a particular cardinalisation which is invariant with respect to monotonic transformations of utility but does depend on the choice of 'reference prices'.
  • [18] However, they persistently refer to society's or government's views about inequality. In view of well-known problems relating to the aggregation of preferences, the 'social welfare function' instead must be interpreted as representing the value judgements of a single independent person.
  • [19] This led to Feldstein's (2012, p. 783) question, 'what kind of nation places no value on the welfare of those with income in the top bracket, treating them as the revenue producing property of the state?' and comment that 'many noneconomists would find the Review's suggestion that a society could disregard the welfare of any group of taxpayers repugnant'. Here it is of course important to distinguish between the value attached to the total welfare of those in the top bracket and the value attached to a marginal reduction in welfare. It is the latter to which SMV, and hence g, applies.
  • [20] However, some authors have investigated the implications of allowing public good expenditure to influence individuals' labour supply decisions.
  • [21] Using the constant inequality aversion approach described above, ε = 2 gives g = 0.0036, which gives an optimal rate of 0.395. A parameter of 2 implies a very high value of inequality aversion: the judge would be prepared to take $1 from a person on average income in the top bracket, give less than half a cent to an average person in the lowest bracket and throw away the rest.

7 Welfare and Non-marginal Tax Changes

Instead of considering small changes, it is useful to be able to evaluate the welfare changes associated with a given tax rate compared with a no-tax situation or, more often, to evaluate the effect of a significant (non-marginal) change. In practice, many tax reforms cannot be considered to involve 'marginal' changes in tax rates. An example involves the introduction of a new top marginal rate in a multi-rate structure. In order to evaluate such changes, it is necessary to consider the precise form of the expenditure function in (19).

7.1 The Expenditure Function

As above, an individual's expenditure function, E(τ,U), is defined here as the minimum virtual income required to achieve a given level of utility, U, for a given tax rate, τ. To derive the expenditure function, first obtain indirect utility, V, as a function of μ and τ, by substituting the optimal values (5) and (2) into (1) to get:

 

Equation  51 .

As before, z0 represents income in the absence of taxation (that is, τ = 0). In this case it is therefore easy to solve for μ in terms of V and τ. Then replacing virtual income, μ, with E(τ,U) and V with U gives: [22]

 

Equation 52.
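As a hedged sketch of the two steps just described, suppose that the specification referred to in (1), (2) and (5), which is not reproduced in this section, is the quasi-linear constant-elasticity form shown in the first line below. That functional form is an assumption made here for illustration. Under it, indirect utility and the expenditure function follow directly, and Shephard's Lemma can be confirmed:

```latex
\begin{align*}
U &= c - \frac{z_0}{1 + 1/\eta}\left(\frac{z}{z_0}\right)^{1 + 1/\eta},
\qquad c = \mu + z(1-\tau)
&& \text{(assumed form of (1) and (2))} \\
z &= z_0 (1-\tau)^{\eta}
&& \text{(assumed form of (5))} \\
V &= \mu + z_0(1-\tau)^{1+\eta} - \frac{\eta}{1+\eta}\, z_0 (1-\tau)^{1+\eta}
   = \mu + \frac{z_0 (1-\tau)^{1+\eta}}{1+\eta} \\
E(\tau, U) &= U - \frac{z_0 (1-\tau)^{1+\eta}}{1+\eta} \\
\frac{\partial E}{\partial \tau} &= z_0 (1-\tau)^{\eta} = z
\end{align*}
```

Because utility is linear in virtual income under this assumption, solving for μ is a one-line rearrangement, which is the sense in which the solution is easy.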

Substituting into (19) gives, for an increase in τ from τ1 to τ2:

 

Equation 53  .

 

Equation  54 .

and:

 

Equation  55 .

The change in revenue from a non-marginal tax rate change is:

 

Equation  56 .

and:

 

Equation  57 .

Using μ = aτ, the term a(τ2 - τ1) is equal to μ2 - μ1.

Notes

  • [22] Using this result, Shephard's Lemma referred to above is easily confirmed, whereby ∂E / ∂τ = z.

7.2 The case where τ1 = 0

Consider first the case where τ1 = 0 and τ2 = τ. That is, consider the welfare change from the introduction of a tax, rather than a change in the tax rate. Then setting τ1 = 0 in (54):

 

Equation 57  .

and:

 

Equation 58  .

Hence the excess burden, EB, is:

 

Equation 59  .

Using \( z_0 = z(1 - \tau)^{-\eta} \) this becomes:

 

Equation 60  .

The excess burden per taxpayer (that is, for those \( N_T \) people with z > a) is thus obtained from (60) by replacing z with \( \bar{z}_T \). The tax revenue per taxpayer is \( \tau(\bar{z}_T - a) \). Hence the welfare cost per person, the excess burden per dollar of revenue, now denoted simply as WC, is:

 

Equation 61  .

As before, in the Pareto case \( \alpha = \bar{z}_T / (\bar{z}_T - a) \) is constant.

Figure 3: Welfare Cost of Taxation

An example of the variation in the welfare cost as the tax rate increases is shown in Figure 3, again for values of \( \eta_{z,1-\tau} = 0.8 \) and α = 1.8 as in the earlier examples. Clearly, the total welfare cost of the tax per dollar of revenue continues to increase steadily beyond the point of maximum revenue.
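The shape of the welfare cost schedule can be explored numerically with the sketch below. The closed form used for WC is derived here under an assumed quasi-linear, constant-elasticity specification in which z = z0(1 - τ)^η, so it may differ in presentation from the paper's expression (61). The revenue-maximising rate 1 / (1 + αη) is the standard result for a single rate above a threshold. The function names are hypothetical.

```python
def welfare_cost(tau, eta=0.8, alpha=1.8):
    """Excess burden per dollar of revenue when a tax at rate tau is introduced.

    Assumed form: WC = (alpha / tau) * { [(1-tau)^(-eta) - (1-tau)] / (1+eta) - tau }.
    """
    burden_per_z = ((1.0 - tau) ** (-eta) - (1.0 - tau)) / (1.0 + eta) - tau
    return alpha * burden_per_z / tau


def revenue_maximising_rate(eta=0.8, alpha=1.8):
    """Rate at which revenue per taxpayer, tau * (z - a), is maximised."""
    return 1.0 / (1.0 + alpha * eta)


rates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
costs = [welfare_cost(t) for t in rates]

if __name__ == "__main__":
    print("revenue-maximising rate:", round(revenue_maximising_rate(), 4))
    for t, wc in zip(rates, costs):
        print(t, round(wc, 3))
```

For small rates the assumed expression behaves like αητ / 2, the familiar triangle approximation, and it rises without limit as τ approaches one, so nothing special happens to it at the revenue-maximising rate.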

7.3 An Increase from τ1 to τ2

For a non-marginal change in the tax rate, (54) and (55) give, where ΔEB is written instead of MEB to indicate that discrete changes are considered:

 

Equation 62  .

The discrete change in the welfare cost, denoted ΔWC, is equal to the change in the total excess burden per dollar of extra revenue, \( \tau_2(\bar{z}_2 - a) - \tau_1(\bar{z}_1 - a) \). Hence in terms of the cost per person (replacing z values in (62) with corresponding averages):

 

Equation 63  .

Furthermore:

 

Equation  64 .

and writing \( \alpha = \bar{z}_2 / (\bar{z}_2 - a) \), then \( a / \bar{z}_2 = 1 - 1/\alpha \). It can be seen that by letting τ1 = 0, this result reduces to (61).
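The reduction to the τ1 = 0 case can be checked numerically. The sketch below again assumes the quasi-linear, constant-elasticity specification with z = z0(1 - τ)^η, and it assumes the same group of taxpayers is above the threshold at both rates. The parameter values (mean no-tax income of 150, threshold of 70, elasticity of 0.6) and the function names are hypothetical and chosen only for illustration.

```python
def excess_burden_per_z0(tau, eta):
    """Excess burden of rate tau relative to no tax, per unit of no-tax income z0."""
    return (1.0 - (1.0 - tau) ** (1.0 + eta)) / (1.0 + eta) - tau * (1.0 - tau) ** eta


def delta_welfare_cost(tau1, tau2, eta, z0_mean, a):
    """Change in excess burden per dollar of extra revenue for a move tau1 -> tau2."""
    z1 = z0_mean * (1.0 - tau1) ** eta  # mean taxable income at the old rate
    z2 = z0_mean * (1.0 - tau2) ** eta  # mean taxable income at the new rate
    delta_eb = z0_mean * (excess_burden_per_z0(tau2, eta) - excess_burden_per_z0(tau1, eta))
    delta_rev = tau2 * (z2 - a) - tau1 * (z1 - a)
    return delta_eb / delta_rev


def welfare_cost_from_zero(tau, eta, z0_mean, a):
    """Welfare cost of introducing the tax, written in terms of alpha = z / (z - a)."""
    z = z0_mean * (1.0 - tau) ** eta
    alpha = z / (z - a)
    return (alpha / tau) * (((1.0 - tau) ** (-eta) - (1.0 - tau)) / (1.0 + eta) - tau)


if __name__ == "__main__":
    eta, z0_mean, a = 0.6, 150.0, 70.0
    print(delta_welfare_cost(0.0, 0.38, eta, z0_mean, a))
    print(welfare_cost_from_zero(0.38, eta, z0_mean, a))
    print(delta_welfare_cost(0.33, 0.38, eta, z0_mean, a))
```

With these illustrative values the cost per dollar of the extra revenue from moving between two positive rates is much larger than the average cost of the whole tax, because the higher rate is close to the revenue-maximising rate and the additional revenue is small.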

8 Conclusions

The aim of this paper has been to provide a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. This concept is now widely used in discussions of income tax policy, although a number of the results and assumptions are not entirely transparent in the literature. Using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal tax rates are derived. In addition, some new results relating to non-marginal tax changes, which are often relevant in practice, are presented.

It is particularly important to be able to consider the relevant value judgements used, so that the sources of policy disagreements can more easily be identified. Attention was given to the way value judgements enter into the calculation of optimal tax rates using the elasticity of taxable income measure, where they are somewhat less explicit than in the context of structural models which maximise a specified social welfare, or evaluation, function.

It was stated in the introduction that the use of the reduced-form concept of the elasticity of taxable income allows some strong results to be obtained in terms that are perhaps 'more concrete' than the results from structural models of optimal income taxation. However, it has also been seen that the results come at a cost of some strong simplifying assumptions. One point that perhaps needs stressing here is that the elasticity of taxable income, even within a model having a constant elasticity, is not in fact a fixed parameter but depends on many elements of the tax structure including, in particular, the ease of shifting income between sources. This leads to a distinction between optimal tax rates and optimal tax structures, where the latter includes things like the ease of becoming incorporated, establishing family trusts, and so on.

Both structural and reduced-form models are clearly highly simplified, both in terms of the economic environment and the behaviour of individuals. Neither approach can of course be expected to provide detailed practical policy advice. However, they can both be used, in their different ways, to illuminate and clarify different aspects of the complex relationships involved in choosing a tax rate structure.

References

Brewer, M., E. Saez, and A. Shephard (2010, Apr). Means testing and tax rates on earnings.

Claus, I., J. Creedy, and J. Teng (2012). The elasticity of taxable income in New Zealand. Fiscal Studies 33(3), 287-303.

Creedy, J. (1998a). Measuring Welfare Changes and Tax Burdens. Edward Elgar Pub.

Creedy, J. (1998b). The optimal linear income tax model: Utility or equivalent income? Scottish Journal of Political Economy 45(1), 99-110.

Creedy, J. (2008). Choosing the tax rate in a linear income tax structure. Australian Journal of Labour Economics (AJLE) 11(3), 257-276.

Creedy, J. (2009). Explicit solutions for the optimal linear income tax rate. Australian Economic Papers 48, 224-236.

Creedy, J. (2010). Elasticity of taxable income: an introduction and some basic analytics. Public Finance and Management 10, 556-589.

Creedy, J. and N. Gemmell (2013). Measuring revenue responses to tax rate changes in multi-rate income tax systems: Behavioural and structural factors. International Tax and Public Finance (forthcoming).

Feldstein, M. (1995). The effect of marginal tax rates on taxable income: A panel study of the 1986 tax reform act. Journal of Political Economy 103(3), 551-72.

Feldstein, M. (2012). The Mirrlees review. Journal of Economic Literature 50(3), 781-90.

Mirrlees, J. (1971). An exploration in the theory of optimal income taxation. Review of Economic Studies 38, 175-208.

Mirrlees, J. (2010). Dimensions of tax design. Oxford: Oxford University Press for the Institute for Fiscal Studies.

Mirrlees, J. (2011). Tax by design. Oxford: Oxford University Press for the Institute for Fiscal Studies.

Robbins, L. C. (1935). An essay on the nature and significance of economic science (2. ed., rev. and extended ed.). London: Macmillan.

Saez, E. (2001). Using elasticities to derive optimal income tax rates. Review of Economic Studies 68(1), 205-29.

Saez, E., J. Slemrod, and S. H. Giertz (2012). The elasticity of taxable income with respect to marginal tax rates: A critical review. Journal of Economic Literature 50(1), 3-50.

Tuomala, M. (1985). Simplified formulae for optimal linear income taxation. Scandinavian Journal of Economics 87, 668-672.