Abstract
This paper provides a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. It draws together, using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal taxes. Particular attention is given to the way value judgements can be specified when using this approach, and results are illustrated using the New Zealand income tax. In addition, some new results, particularly in terms of non-marginal tax changes, are presented.
Acknowledgements
I am very grateful to Martin Keene for many detailed comments, including pointing out an error in an earlier draft. I have also benefited from comments and suggestions by Simon Carey, Norman Gemmell, Jose Sanz-Sanz and Michael Freudenberg.
Disclaimer
The views, opinions, findings, and conclusions or recommendations expressed in this Working Paper are strictly those of the author. They do not necessarily reflect the views of the New Zealand Treasury or the New Zealand Government. The New Zealand Treasury and the New Zealand Government take no responsibility for any errors or omissions in, or for the correctness of, the information contained in these working papers. The paper is presented not as policy, but with a view to inform and stimulate wider debate.
Executive Summary
The concept of the elasticity of taxable income has become widely used in both the positive literature on the behavioural incentive effects of income taxation and in the normative literature on welfare effects and optimal taxation. This elasticity is defined as the elasticity of taxable income with respect to the net-of-tax rate (one minus the marginal tax rate), and is therefore positive. The attractions are that its use eliminates the need to construct and estimate a fully specified structural model of taxpayers' behaviour, and optimal tax rates can be readily discussed and expressed explicitly in terms of the elasticity of taxable income. These advantages nevertheless come at a cost, in terms of the difficulties of empirical estimation and the strong underlying assumptions required to generate some of the results.
The aim of the present paper is to provide a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. It draws together, using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal taxes. In addition, it presents some new results, particularly in terms of non-marginal tax changes.
In the 'standard' optimal tax literature, in the context of a tax and transfer system where labour supply responds to tax changes, a starting point is a social welfare, or evaluation, function expressed in terms of individuals' utilities. This welfare function reflects the value judgements of an independent judge, in particular regarding the judge's aversion to inequality. In the present context a social welfare function is not fully specified but a judge is assumed to take a view about the value of additional government tax-financed expenditure resulting from the extra revenue from a small tax increase. The additional expenditure is not explicitly divided into transfer and other expenditure. The independent judge also forms a view about the weight attached to the loss of welfare resulting from the small tax increase.
The main concepts required here are the social marginal valuation, SMV, which reflects the weight attached to the loss of welfare suffered by those in the relevant tax bracket as a result of a small tax increase, and the marginal value of public funds, MVPF, which is the value attributed by the judge to the extra tax-financed expenditure resulting from the small tax increase.
Examining the implications of adopting alternative value judgements involves examining the effects on optimal tax rates of alternative values of the ratio SMV / MVPF. The question therefore arises of how to interpret different orders of magnitude. The paper seeks to make assumptions more transparent.
Both the standard optimal tax approach and the use of the elasticity of taxable income involve the use of highly simplified models, both in terms of the economic environment and the behaviour of individuals. Neither approach can of course be expected to provide detailed practical policy advice. However, they can both be used, in their different ways, to illuminate and clarify different aspects of the complex relationships involved in choosing a tax rate structure.
1 Introduction
The concept of the elasticity of taxable income has, following Feldstein (1995), become widespread in both the positive literature on the behavioural incentive effects of income taxation and in the normative literature on welfare effects and optimal taxation. [1] This elasticity is defined as the elasticity of taxable income with respect to the net-of-tax rate (one minus the marginal tax rate), and is therefore positive. For example, much use was made of this elasticity, following the contribution of Saez (2001), in the report chaired by Sir James Mirrlees, which consisted of two substantial volumes (Mirrlees, 2010, 2011) produced under the aegis of the Institute for Fiscal Studies in London. The attraction is obvious: the use of a reduced-form approach eliminates the need to construct and estimate a fully specified structural model of taxpayers' behaviour, and optimal tax rates can be readily discussed and expressed explicitly in terms of the elasticity of taxable income. Such elasticities are 'bread and butter' to economists and may be regarded as being more 'concrete' than the elements which enter into the determinants of optimal tax rates in the types of structural labour supply model which followed the earlier work of Mirrlees (1971). [2] These advantages nevertheless come at a cost, in terms of the difficulties of empirical estimation and the strong underlying assumptions required to generate some of the results.
The aim of the present paper is to provide a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. It draws together, using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal taxes, in addition to presenting some new results, particularly in terms of non-marginal tax changes.
Section 2 introduces the quasilinear utility function that is implicit in all studies which use a constant elasticity specification in which there are no income effects on taxable income of marginal tax rate changes. Section 3 discusses the revenue and welfare effects, measured in terms of the excess burden and marginal welfare cost of small increases in the marginal tax rate. For simplicity the tax function is assumed to have a single marginal rate applied above an income threshold. The results are extended in Section 4 to allow for the situation in which some income is shifted into an alternative form which faces a lower marginal tax rate. Section 5 introduces the more realistic multi-rate income tax function and shows that the results of the previous sections apply directly to the top marginal tax rate in such a structure. Section 5 then extends those results to deal with tax rates in any of the income brackets in the multi-rate function. Expressions for optimal rates are examined in Section 6. The discussion emphasises the treatment of value judgements in this context.
All the results in Sections 3 to 6 apply to small changes: this is appropriate when considering optimal rates, for which an equimarginal condition applies whereby the marginal benefits of a small increase in a tax rate (arising from the extra expenditure financed by the revenue increase) must be equal to the marginal cost (in terms of the weight attached to welfare changes), as perceived by an independent judge. The marginal excess burden per dollar of revenue raised (the marginal welfare cost) clearly becomes infinitely large as the tax rate (within any income band) approaches its revenue-maximising rate, in view of the fact that the change in revenue (the denominator) becomes zero. However, the total excess burden as a proportion of the total tax raised remains finite for rates beyond this point. In some policy contexts the total efficiency loss arising from the tax is of primary interest rather than the marginal excess burden. In other contexts efficiency losses from discrete changes in tax rates are relevant. Section 7 thus extends results to deal with the welfare changes arising from non-marginal tax rate changes. Brief conclusions are in Section 8.
Notes
 [1] Saez et al. (2012) survey a vast literature on the elasticity of taxable income, and an introduction to some of the basic analytics can be found in Creedy (2010).
 [2] However, in the case of the linear income tax, Tuomala (1985) gives some elegant results which can also be expressed in terms of easily interpreted elasticities: for extensions, including comparisons with majority-voting outcomes, see also Creedy (2008).
2 The Basic Specification
In the literature on the elasticity of taxable income, a constant-elasticity reduced-form specification is ubiquitous, yet its derivation is seldom discussed explicitly. It is therefore useful to begin by stating clearly the nature of the required assumptions concerning individuals' utility functions and budget constraints. Let c denote net income and z gross taxable income, with z_{0} the value of income in the absence of taxation. Consider the quasilinear form with parameter, η:
$$U = c - \frac{z_{0}}{1+1/\eta}\left(\frac{z}{z_{0}}\right)^{1+1/\eta} \tag{1}$$
with budget constraint:
$$c = \mu + z(1-\tau) \tag{2}$$
Here μ is virtual income [3] and τ is the marginal tax rate. This is associated with a tax function of the form, for z > a, where a is a tax-free threshold:
$$T(z) = \tau(z-a) \tag{3}$$
and T(z) = 0 for z ≤ a, so that μ = aτ. Substituting for c in the utility function and differentiating with respect to z gives:
$$\frac{dU}{dz} = (1-\tau) - \left(\frac{z}{z_{0}}\right)^{1/\eta} \tag{4}$$
Setting this equal to zero and solving for z gives:
$$z = z_{0}(1-\tau)^{\eta} \tag{5}$$
and the elasticity of taxable income, η_{z,1−τ} = {(1 − τ) / z}{dz / d(1 − τ)}, is constant at η. It is the linear term in c, and the absence of μ from the term involving z, which ensures that income effects are zero. It must be acknowledged that the (largely untested) assumption of no income effects is made mainly for pragmatic reasons, as it considerably simplifies the analysis. [4] Income effects would mean, for example, that the behaviour of higher-rate taxpayers in a multi-rate structure changes in response to tax rate changes in lower-tax brackets.
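The constant-elasticity property can be checked numerically. The following is a minimal sketch, assuming the reduced form z = z_{0}(1 − τ)^{η}; the function names and the parameter values used in the check are illustrative assumptions, not taken from the paper.

```python
def taxable_income(z0: float, tau: float, eta: float) -> float:
    """Reduced-form taxable income z = z0 * (1 - tau)**eta (no income effects)."""
    return z0 * (1.0 - tau) ** eta


def numerical_elasticity(z0: float, tau: float, eta: float, h: float = 1e-6) -> float:
    """Central-difference estimate of the elasticity of z with respect to (1 - tau)."""
    net = 1.0 - tau
    z_up = z0 * (net + h) ** eta
    z_down = z0 * (net - h) ** eta
    dz_dnet = (z_up - z_down) / (2.0 * h)
    return dz_dnet * net / taxable_income(z0, tau, eta)


# Hypothetical values: the estimate should be close to eta at any tax rate.
for tau in (0.10, 0.33, 0.60):
    print(tau, round(numerical_elasticity(50_000.0, tau, 0.4), 6))
```

Because the elasticity does not depend on τ or z_{0}, the same value of η is recovered at every rate in the loop.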
Notes
 [3] In a diagram of the budget constraint, with net income (consumption) on the vertical axis and gross income on the horizontal axis, virtual income is the intercept on the vertical axis. In a multi-rate tax schedule virtual income refers to the extension to the vertical axis of the particular segment under consideration.
 [4] The Cobb-Douglas case, which is usually so convenient, produces a much more awkward expression for z: see Creedy (2010, p. 564).
3 Marginal Revenue and Welfare Changes
This section examines the revenue and welfare effects of small changes in the marginal tax rate for the simple linear tax function. It is shown in Section 5 that these results can be directly applied to the top rate in a multi-rate structure.
3.1 Marginal Revenue Changes
The tax paid by an individual is given, for the simple tax function discussed above and for z > a, by τ(z − a). Total revenue collected is thus:
$$R = \tau\sum_{z_{i}>a}(z_{i}-a) \tag{6}$$
Let z̄_{T} denote the arithmetic mean income of those above the threshold, and N_{T} the number of people above the threshold. Then total revenue becomes:
$$R = \tau N_{T}(\bar{z}_{T}-a) \tag{7}$$
The effect on R of a small change in τ, denoted MR, is:
$$MR = \frac{dR}{d\tau} = \frac{\partial R}{\partial \tau} + \frac{\partial R}{\partial \bar{z}_{T}}\frac{d\bar{z}_{T}}{d\tau} \tag{8}$$
The first term is a pure 'tax rate' effect while the second term is a 'tax base' effect of the tax rate change. These effects are also sometimes referred to as 'mechanical' and 'behavioural' effects respectively. This terminology is used, for example, by Saez et al. (2012). In Creedy and Gemmell (2013) the tax base effect is decomposed further and shown to depend on the elasticity of taxable income and the revenue elasticity. Writing (8) in elasticity form gives:
$$\eta_{R,\tau} = \eta'_{R,\tau} + \eta_{R,\bar{z}_{T}}\,\eta_{\bar{z}_{T},\tau}$$
where the mechanical effect, η′_{R,τ}, and the revenue elasticity, η_{R,z̄T}, are both partial elasticities. The term η_{z̄T,τ} is the elasticity of (average) taxable income with respect to the marginal tax rate. For this tax structure, it can be seen that η_{R,z̄T} = z̄_{T} / (z̄_{T} − a) and η′_{R,τ} = 1. In terms of the elasticity of taxable income, η_{z̄T,1−τ}, η_{R,τ} becomes:
$$\eta_{R,\tau} = \eta'_{R,\tau} - \frac{\tau}{1-\tau}\,\eta_{R,\bar{z}_{T}}\,\eta_{\bar{z}_{T},1-\tau}$$
Differentiating (7) with respect to τ and z̄_{T} gives:
$$\frac{\partial R}{\partial \tau} = N_{T}(\bar{z}_{T}-a)$$
and:
$$\frac{\partial R}{\partial \bar{z}_{T}} = N_{T}\tau$$
Hence, using dz̄_{T} / dτ = −η z̄_{T} / (1 − τ), which follows from the constant-elasticity form for z, marginal revenue becomes:
$$\frac{dR}{d\tau} = N_{T}(\bar{z}_{T}-a) - N_{T}\bar{z}_{T}\frac{\eta\tau}{1-\tau} \tag{14}$$
Define the term α as the ratio of the average income, z̄_{T}, obtained by those above the threshold, a, to their average income measured in excess of the threshold, so that:
$$\alpha = \frac{\bar{z}_{T}}{\bar{z}_{T}-a} \tag{15}$$
This is also the total income of those above the threshold divided by the total income measured in excess of the threshold. From above, it is also known that this is the same as the revenue elasticity, η_{R,z̄T}, at z̄_{T}. Hence (14) is more succinctly written as:
$$\frac{dR}{d\tau} = N_{T}(\bar{z}_{T}-a)\left\{1-\alpha\eta\frac{\tau}{1-\tau}\right\} \tag{16}$$
The elasticity of tax revenue with respect to the tax rate, η_{R,τ}, is thus:
$$\eta_{R,\tau} = 1-\alpha\eta\frac{\tau}{1-\tau} \tag{17}$$
The tax rate, τ^{*}, which maximises revenue, obtained by setting dR / dτ = 0, is thus a simple function of α and the elasticity, η, whereby:
$$\tau^{*} = \frac{1}{1+\alpha\eta} \tag{18}$$
Thus the revenue change in (14) depends on the precise form of the distribution of declared income and the income threshold above which the tax rate, τ, applies.
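As a worked illustration of the revenue-maximising rate, the following minimal sketch evaluates τ^{*} = (1 + αη)^{−1} and confirms that marginal revenue per taxpayer, (z̄_{T} − a){1 − αητ / (1 − τ)}, is zero at that rate. The mean income and threshold are hypothetical values chosen so that α = 1.8; the function names are also assumptions.

```python
def alpha_ratio(mean_income: float, threshold: float) -> float:
    """alpha: mean income above the threshold over mean income in excess of it."""
    return mean_income / (mean_income - threshold)


def revenue_maximising_rate(alpha: float, eta: float) -> float:
    """tau* = 1 / (1 + alpha * eta), holding alpha and eta constant."""
    return 1.0 / (1.0 + alpha * eta)


def marginal_revenue_per_taxpayer(tau, mean_income, threshold, eta):
    """Mechanical effect (mean - a) plus behavioural effect, per taxpayer."""
    alpha = alpha_ratio(mean_income, threshold)
    return (mean_income - threshold) * (1.0 - alpha * eta * tau / (1.0 - tau))


# Hypothetical top bracket: mean income 90,000 above a threshold of 40,000.
alpha = alpha_ratio(90_000.0, 40_000.0)          # 1.8
tau_star = revenue_maximising_rate(alpha, 0.8)   # about 0.41
print(round(alpha, 2), round(tau_star, 2))
```

The calculation treats α as fixed, which the text notes is strictly valid only when the upper tail follows a Pareto distribution.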
3.2 Marginal Welfare Changes
Consider the marginal welfare change arising from a small change in the marginal tax rate, τ. Let E(τ,U) denote the expenditure function, expressed in terms of virtual income, μ, where individual subscripts have been omitted. Hence E(τ,U) is the minimum virtual income required to achieve a given level of utility, U, for a given tax rate, τ. The equivalent variation, EV, measures the welfare change resulting from a change in the tax rate from τ_{1} to τ_{2}, where the change in the tax rate has a dual effect of changing the 'price' and the virtual income. Using subscripts to denote appropriate values of U and μ, it is defined as [5]:
$$EV = \left[E(\tau_{2},U_{2}) - E(\tau_{1},U_{2})\right] + (\mu_{1}-\mu_{2}) \tag{19}$$
The first term is the 'price effect' and the second term is the 'income effect' of the tax change, and E(τ_{2},U_{2}) = μ_{2}. For small changes this can be written as:
$$EV = \frac{\partial E(\tau,U)}{\partial \tau} - \frac{d\mu}{d\tau} \tag{20}$$
Using Shephard's Lemma (the Envelope theorem), it is known that ∂E(τ,U) / ∂τ = z^{H}, where the superscript indicates that it is the Hicksian, or compensated, 'demand'. In the present context, income effects are absent so that Marshallian and Hicksian demands are equal for each individual. Hence, ∂E(τ,U) / ∂τ = z.
Furthermore, from the budget constraint defined above, μ = aτ, and so dμ / dτ = a for all individuals above the threshold. Hence the welfare change is simply:
$$EV = z-a \tag{21}$$
This welfare change is equivalent to ∂R / ∂τ for each individual taxpayer, which is the tax-rate, or mechanical, effect on revenue of a change in τ. Adding these changes over all those above a gives the aggregate welfare change per taxpayer as:
$$EV = \bar{z}_{T}-a \tag{22}$$
The marginal excess burden per taxpayer, MEB = EV − MR, arising from the tax is found by dividing (16) by N_{T} and subtracting the result from (22) to give:
$$MEB = (\bar{z}_{T}-a)\,\alpha\eta\frac{\tau}{1-\tau} = \bar{z}_{T}\frac{\eta\tau}{1-\tau} \tag{23}$$
The MEB is thus equal to the absolute value of the tax-base, or behavioural, effect on tax revenue of a rate change.[6] The total marginal welfare cost per dollar of extra revenue, MWC, is defined as the aggregate marginal excess burden divided by the change in aggregate tax revenue. This is:
$$MWC = \frac{\alpha\eta\frac{\tau}{1-\tau}}{1-\alpha\eta\frac{\tau}{1-\tau}} = \frac{\alpha\eta\tau}{1-\tau(1+\alpha\eta)} \tag{24}$$
This expression is relevant only when the marginal tax rate is below the revenue-maximising rate given in (18), so that dR / dτ > 0. The MWC initially rises slowly as τ increases, for low values of τ. Then as τ approaches the value for which dR / dτ = 0, the MWC increases rapidly for further tax rate increases. At the tax rate for which dR / dτ = 0, no extra revenue can be obtained from a small increase in the tax rate and so the marginal welfare cost per dollar of extra revenue is clearly infinitely large. Sometimes this expression for MWC is used to compute its value for increasing values of τ, holding η and α constant. The latter obviously relies on the assumption that the ratio z̄_{T} / (z̄_{T} − a) remains constant; that is, it is independent of the tax structure. This property holds only for Pareto distributions. For alternative income distributions, the value of α is likely to change as z̄_{T}, itself a function of τ, changes. The MWC can also be expressed in terms of the two elasticities, the elasticity of taxable income and the revenue elasticity, as follows:
$$MWC = \frac{\frac{\tau}{1-\tau}\,\eta_{R,\bar{z}_{T}}\,\eta_{\bar{z}_{T},1-\tau}}{1-\frac{\tau}{1-\tau}\,\eta_{R,\bar{z}_{T}}\,\eta_{\bar{z}_{T},1-\tau}}$$
 Figure 1: Revenue Elasticity and Marginal Welfare Cost Variations

Illustrative examples of the variation in the revenue elasticity, η_{R,τ}, and the marginal welfare cost, as the tax rate increases, are shown in Figure 1, for values of η_{z,1−τ} = 0.8 and α = 1.8, where α is the ratio of the average income of those above the threshold to that average measured in excess of the threshold. The top section of the figure shows how the revenue elasticity falls as the tax rate increases, with a revenue-maximising value of τ = 0.41 when η_{R,τ} = 0. The marginal welfare cost increases extremely rapidly as the tax rate approaches its revenue-maximising value. Lower values of both η_{z,1−τ} and α cause both curves to shift to the right as the revenue-maximising rate increases.
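The pattern described for Figure 1 can be tabulated directly. The following is a minimal sketch, assuming the revenue elasticity is 1 − x and the marginal welfare cost is x / (1 − x), where x = αητ / (1 − τ), with α and η held constant at the values quoted for the figure. The function names and the grid of tax rates are assumptions.

```python
def behavioural_ratio(tau: float, alpha: float, eta: float) -> float:
    """x = alpha * eta * tau / (1 - tau): behavioural over mechanical effect."""
    return alpha * eta * tau / (1.0 - tau)


def revenue_elasticity(tau: float, alpha: float, eta: float) -> float:
    """Elasticity of revenue with respect to the tax rate, 1 - x."""
    return 1.0 - behavioural_ratio(tau, alpha, eta)


def marginal_welfare_cost(tau: float, alpha: float, eta: float) -> float:
    """Marginal excess burden per dollar of extra revenue, x / (1 - x)."""
    x = behavioural_ratio(tau, alpha, eta)
    if x >= 1.0:
        # No extra revenue is raised at or beyond the revenue-maximising rate.
        raise ValueError("tau is at or above the revenue-maximising rate")
    return x / (1.0 - x)


ALPHA, ETA = 1.8, 0.8  # values quoted in the text for Figure 1
for tau in (0.10, 0.20, 0.30, 0.40):
    print(tau,
          round(revenue_elasticity(tau, ALPHA, ETA), 3),
          round(marginal_welfare_cost(tau, ALPHA, ETA), 3))
```

The rapid rise in the last column as τ approaches 0.41 reflects the denominator, the revenue elasticity, approaching zero.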
Notes
 [5] On welfare changes and associated concepts, see Creedy (1998a).
 [6] This is probably the source of a misunderstanding, regarding the comment by Brewer et al. (2010, p. 61) that, 'A tax change that would have been revenue neutral in the absence of a reduction in work effort will instead produce a revenue loss. It is the size of this revenue loss that determines the [marginal] "excess burden" of taxation'. In his review, Feldstein (2012, p. 782) criticised this comment, interpreting the revenue change, to which Brewer et al. alluded, as the total change in revenue, rather than only the behavioural component.
4 The Effect of Income Shifting
The previous discussion has assumed that the disincentive effect of taxation involves a reduction in taxable income that is also the same as gross income. However, a proportion, s, of income that would otherwise be obtained, or reported, may be shifted into another source, where it is taxed at a lower rate, t < τ.
Thus, the imposition of income tax at the rate, τ, means that a proportion, s, of the income reduction, z_{0} − z, is taxed at the rate, t. The individual's optimisation problem is thus to maximise utility, as in (1), subject to the budget constraint whereby net income, c, is given by:
$$c = z - \tau(z-a) - st(z_{0}-z)$$
This can be written as:
$$c = (a\tau - stz_{0}) + z\{1-(\tau-st)\}$$
Virtual income thus becomes (aτ − stz_{0}) and the tax rate becomes (τ − st). As before, substituting for c in the utility function, setting dU / dz = 0 and solving for z gives:
$$z = z_{0}\{1-(\tau-st)\}^{\eta}$$
The solution for z therefore takes the constant elasticity form, as above, but with the rate, τ, replaced by the effective tax rate τ − st. [7]
The tax-rate, or mechanical, effect on revenue of a marginal increase in τ is, as before, ∂R / ∂τ = z − a, while dz / dτ = −z_{0}η{1 − (τ − st)}^{η−1} and ∂R / ∂z = τ − st. Hence the tax-base, or behavioural, effect of an increase in τ is given by:
$$\frac{\partial R}{\partial z}\frac{dz}{d\tau} = -(\tau-st)\,z_{0}\eta\{1-(\tau-st)\}^{\eta-1} = -\frac{\eta z(\tau-st)}{1-(\tau-st)}$$
This is, as shown above, the same in absolute value as the excess burden, so that the marginal welfare cost of a small increase in τ, following the same steps as before, becomes:
$$MWC = \frac{\alpha\eta(\tau-st)}{1-(\tau-st)(1+\alpha\eta)}$$
This is clearly the same as the earlier result for s = 0, but with τ replaced by τ − st. [8]
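The effect of income shifting on the marginal welfare cost can be illustrated by replacing τ with the effective rate τ − st in the no-shifting expression αητ / {1 − τ(1 + αη)}. The sketch below assumes constant α and η; the values of s, t and τ are hypothetical and the function names are assumptions.

```python
def effective_rate(tau: float, s: float, t: float) -> float:
    """Effective marginal rate tau - s*t when a share s is shifted and taxed at t."""
    return tau - s * t


def mwc_with_shifting(tau: float, s: float, t: float,
                      alpha: float, eta: float) -> float:
    """Marginal welfare cost with tau replaced by the effective rate tau - s*t."""
    rate = effective_rate(tau, s, t)
    denominator = 1.0 - rate * (1.0 + alpha * eta)
    if denominator <= 0.0:
        raise ValueError("effective rate is at or above the revenue-maximising rate")
    return alpha * eta * rate / denominator


# Hypothetical case: top rate 0.33, half of the income reduction shifted
# to a source taxed at 0.28, with alpha = 1.8 and eta = 0.8.
print(round(mwc_with_shifting(0.33, 0.0, 0.28, 1.8, 0.8), 3))  # no shifting
print(round(mwc_with_shifting(0.33, 0.5, 0.28, 1.8, 0.8), 3))  # with shifting
```

With s > 0 the effective rate is lower, so for given α and η the computed marginal welfare cost is smaller than in the no-shifting case.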
Notes
 [7] This clearly raises problems for the estimation of η, since s cannot be observed. Estimation is beyond the scope of the present paper.
 [8] Saez et al. (2012, p.11) give an incorrect expression, by not recognising that in this case τ must be replaced by τ − st in the solution for z. An incorrect form is also given in Creedy (2010, p. 572), which also contains a printing error, and Claus et al. (2012, p. 301), who follow Saez et al.
5 A Multi-Rate Tax Structure
The previous sections have considered the case of a tax structure having a single rate applied to income measured above a tax-free threshold. The present section extends the results to the more realistic multi-rate structure that is widely used in practice. Subsection 5.1 describes the tax structure and shows that the results in previous sections can be interpreted as simply applying to the top rate in a multi-rate structure. Subsection 5.2 considers marginal revenue and welfare changes for intermediate rates.
5.1 The Tax Function
Consider the multi-step tax function, which is defined by a set of income thresholds, a_{k}, for k = 1,...,K, and marginal income tax rates, τ_{k}, applying in tax brackets, that is between adjacent thresholds a_{k} and a_{k+1}. The function can be written as:
$$T(z) = \begin{cases} 0 & 0 < z \le a_{1} \\ \tau_{1}(z-a_{1}) & a_{1} < z \le a_{2} \\ \tau_{1}(a_{2}-a_{1}) + \tau_{2}(z-a_{2}) & a_{2} < z \le a_{3} \end{cases}$$
and so on. If z falls into the kth tax bracket, so that a_{k} < z ≤ a_{k+1}, T(z) can be written for k ≥ 2 as:
$$T(z) = \tau_{k}(z-a_{k}) + \sum_{j=1}^{k-1}\tau_{j}(a_{j+1}-a_{j})$$
Letting
$$b_{k} = \sum_{j=1}^{k-1}\tau_{j}(a_{j+1}-a_{j})$$
this becomes T(z) = τ_{k}(z − a_{k}) + b_{k}.[9] Hence for an individual whose income falls into the kth tax bracket, the budget constraint in (2) becomes:
$$c = (a_{k}\tau_{k} - b_{k}) + z(1-\tau_{k})$$
and the virtual income, μ, is simply reduced by the term b_{k}. This means that all the above results can be applied directly to the top rate in a multi-rate structure. Importantly, references to tax revenue must all refer to revenue collected at the top marginal rate only. The assumption that the top tail of the distribution can be approximated by the Pareto distribution is clearly more reasonable in this context. The above results can easily be extended to the case of any tax rate in a multi-rate structure, as follows.
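The multi-step function and the constants b_{k} can be sketched in code as follows. This is a minimal illustration, assuming T(z) = τ_{k}(z − a_{k}) + b_{k} within bracket k with b_{k} the tax paid at lower rates; the schedule used is a purely illustrative four-rate example with a zero first threshold, and all names are assumptions.

```python
from bisect import bisect_left
from itertools import accumulate

# Illustrative schedule (hypothetical): thresholds a_k and marginal rates tau_k.
THRESHOLDS = [0.0, 14_000.0, 48_000.0, 70_000.0]
RATES = [0.105, 0.175, 0.30, 0.33]


def bracket_constants(thresholds, rates):
    """b_k = sum over lower brackets j of tau_j * (a_{j+1} - a_j), with b_1 = 0."""
    steps = [r * (hi - lo) for r, lo, hi in zip(rates, thresholds, thresholds[1:])]
    return [0.0] + list(accumulate(steps))


def bracket_index(z, thresholds):
    """Zero-based index k of the bracket with a_k < z <= a_{k+1}."""
    return bisect_left(thresholds, z) - 1


def tax(z, thresholds, rates):
    """T(z) = tau_k * (z - a_k) + b_k, and zero at or below the first threshold."""
    if z <= thresholds[0]:
        return 0.0
    k = bracket_index(z, thresholds)
    return rates[k] * (z - thresholds[k]) + bracket_constants(thresholds, rates)[k]


def virtual_income(k, thresholds, rates):
    """mu_k = a_k * tau_k - b_k, the intercept of the extended budget segment."""
    return thresholds[k] * rates[k] - bracket_constants(thresholds, rates)[k]


z = 60_000.0
k = bracket_index(z, THRESHOLDS)
print(round(tax(z, THRESHOLDS, RATES), 2))
print(round(virtual_income(k, THRESHOLDS, RATES) + z * (1.0 - RATES[k]), 2))
```

The second printed value is net income computed from the linearised budget constraint, which should equal z − T(z).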
Notes
 [9] This expression for T(z) can be rewritten as T(z) = τ_{k}(z − a*_{k}) where a*_{k} = a_{k} − b_{k} / τ_{k}.
5.2 Changes in Intermediate Tax Rates
In order to consider changes in lower tax rates, rather than the top rate, it is sufficient here to consider a two-rate structure, where the rate τ_{L} applies to incomes between the income thresholds a_{L} and a_{H}. Let N_{L} denote the number of people in the first tax bracket and N_{H} the number in the top bracket. [10] Let R_{τL} denote the total tax revenue raised at the rate τ_{L}, that is only from income that falls into the lower bracket, for which a_{L} < z ≤ a_{H}. The higher-rate payers must pay τ_{L} on an amount, a_{H} − a_{L}, of their income, so that:
$$R_{\tau_{L}} = \tau_{L}\left\{N_{L}(\bar{z}_{L}-a_{L}) + N_{H}(a_{H}-a_{L})\right\}$$
where z̄_{L} is the arithmetic mean income of those who fall into the tax bracket with the marginal rate of τ_{L}. The corresponding marginal revenue, using dz̄_{L} / dτ_{L} = −η z̄_{L} / (1 − τ_{L}), is:
$$\frac{dR_{\tau_{L}}}{d\tau_{L}} = N_{L}(\bar{z}_{L}-a_{L})\left\{1-\alpha_{L}\eta\frac{\tau_{L}}{1-\tau_{L}}\right\} + N_{H}(a_{H}-a_{L})$$
where in this case:
$$\alpha_{L} = \frac{\bar{z}_{L}}{\bar{z}_{L}-a_{L}}$$
From earlier results the aggregate marginal excess burden is:
$$N_{L}(\bar{z}_{L}-a_{L})\,\alpha_{L}\eta\frac{\tau_{L}}{1-\tau_{L}}$$
and the marginal welfare cost is thus found to be:
$$MWC = \frac{\alpha_{L}\eta\frac{\tau_{L}}{1-\tau_{L}}}{1+D-\alpha_{L}\eta\frac{\tau_{L}}{1-\tau_{L}}}$$
where:
$$D = \frac{N_{H}(a_{H}-a_{L})}{N_{L}(\bar{z}_{L}-a_{L})}$$
The expression for the marginal welfare cost of raising the lower tax rate is thus the same as for the top tax rate, with the addition of the term D in the denominator.
The rate that maximises revenue from the rate τ_{L} is given by:
$$\tau_{L}^{*} = \left[1+\eta\left\{\frac{N_{L}\bar{z}_{L}}{N_{L}(\bar{z}_{L}-a_{L}) + N_{H}(a_{H}-a_{L})}\right\}\right]^{-1} \tag{41}$$
which clearly reduces to τ^{*} = (1 + ηα)^{−1} for the top tax rate, as obtained above in (18). Furthermore, the interpretation of the term in curly brackets in (41) is the same as that of α for the earlier result: it is the ratio of total income of those who fall into the relevant bracket to the total income that is taxed at the relevant marginal rate.
Notes
 [10] In general, of course, N_{H} can refer to all those in higher-rate brackets than the one being considered.
6 Optimal Tax Rates
The use of a reduced-form expression for taxable income in terms of the marginal tax rate means that it is also possible to express optimal tax rates in terms of the elasticity of taxable income, using the above results. In the 'standard' optimal tax literature stemming from Mirrlees (1971), in the context of a tax and transfer system where labour supply is endogenous, a starting point is a social welfare, or evaluation, function expressed in terms of individuals' utilities. This welfare function reflects the value judgements of an independent judge, in particular regarding the judge's aversion to inequality. The judge selects the tax rate (for example, the single rate in a linear tax function) to maximise the welfare function, while individuals select their labour supply to maximise utility. The value of a transfer payment is determined by the need to satisfy a government budget constraint. This budget constraint may involve a requirement to raise a given amount of non-transfer expenditure per person (rather than considering a 'pure' transfer system), but the optimal tax models usually consider this as involving a 'black hole', in that the benefits of the resulting expenditure do not enter either individuals' utility functions or the welfare function of the judge. It is well known that in general numerical simulation methods must be used to obtain results. [11] However, for this structural model, the value judgements of the judge, the nature of the tax and transfer system, and the government's budget constraint are entirely transparent.
In the present context a social welfare function is not fully specified but a judge is assumed to take a view about the value of additional government tax-financed expenditure resulting from the extra revenue from a small tax increase. [12] The additional expenditure is not explicitly divided into transfer and other expenditure. Given that a reduced-form model of individual behaviour is used, neither component of this expenditure is considered to enter the utility functions. The independent judge also forms a view about the weight attached to the loss of welfare resulting from the small tax increase. The loss of welfare is expressed as in previous sections above.
6.1 First-order Conditions in a Multi-rate Structure
The approach involves considering each tax bracket in turn; hence decisions regarding income thresholds are supposed already to have been made. [13] The value judgements of the judge are reflected in two terms. The social marginal valuation, SMV, reflects the weight attached to the loss of welfare suffered by those in the relevant tax bracket as a result of a small tax increase. The marginal value of public funds, MVPF, is the value attributed by the judge to the extra tax-financed expenditure resulting from the small tax increase. The optimal tax rate in the bracket is that rate for which (in the view of the judge) the marginal benefit of a further tax increase just matches the marginal cost. Hence the first-order condition for each tax bracket is:
$$SMV \cdot EV = MVPF \cdot MR \tag{42}$$
The left-hand side of (42) is the marginal cost, while the right-hand side is the marginal benefit of the tax increase. The previous sections have expressed the efficiency cost of a marginal tax increase in terms of the marginal excess burden per dollar of extra revenue, the MWC. Thus it is useful to convert this 'equimarginal' condition into one that involves the MWC. First, rewrite (42) as:
$$\frac{EV}{MR} = \frac{MVPF}{SMV} \tag{43}$$
and since by definition:
$$MWC = \frac{EV-MR}{MR} = \frac{EV}{MR}-1 \tag{44}$$
this first-order condition becomes:
$$1 + MWC = \frac{MVPF}{SMV} \tag{45}$$
Public tax-financed projects may be subject to decreasing marginal valuation by the judge, and the valuation, SMV, may well depend on the tax bracket being considered.
For example, consider the simplest case above, where income is not shifted to lower-taxed sources (so that s = 0) and the rate being examined is the top rate in a multi-rate structure. Let g denote the reciprocal of MVPF / SMV, so that g = SMV / MVPF. The term g therefore represents the weight attached (by a judge) to the welfare loss divided by the weight attached to the extra expenditure financed by the tax change. Substituting for MWC = αητ / {1 − τ(1 + αη)} from (24) and rearranging (45) gives the optimal rate as:
$$\tau = \frac{1-g}{1-g+\alpha\eta} = \left[1+\frac{\alpha\eta}{1-g}\right]^{-1} \tag{46}$$
Furthermore, substituting for α gives the alternative expression:
$$\tau = \left[1+\frac{\eta}{1-g}\left(\frac{\bar{z}_{T}}{\bar{z}_{T}-a}\right)\right]^{-1} \tag{47}$$
The term z̄_{T} / (z̄_{T} − a) measures the ratio of the total income of those in the top tax bracket to the total income that is subject to the top tax rate. In the extreme case where the judge does not care about top-rate taxpayers, g = 0 and the optimal rate is the same as the rate which maximises revenue from those taxpayers. However, this is a closed-form solution only in the case where g is considered to be constant (that is, independent of the tax rate); otherwise the precise form of g(τ) must be known.
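The sensitivity of the optimal top rate to the value judgement can be explored with a minimal sketch, assuming the closed form τ = (1 − g) / (1 − g + αη) with g = SMV / MVPF treated as a constant. The function names and the grid of g values are assumptions; α = 1.8 and η = 0.8 are the illustrative values used earlier in the text.

```python
def optimal_top_rate(g: float, alpha: float, eta: float) -> float:
    """Optimal top rate (1 - g) / (1 - g + alpha*eta) for constant g <= 1."""
    if g > 1.0:
        raise ValueError("this sketch assumes g = SMV/MVPF does not exceed 1")
    return (1.0 - g) / (1.0 - g + alpha * eta)


def marginal_welfare_cost(tau: float, alpha: float, eta: float) -> float:
    """MWC = alpha*eta*tau / (1 - tau*(1 + alpha*eta)), below the revenue peak."""
    return alpha * eta * tau / (1.0 - tau * (1.0 + alpha * eta))


ALPHA, ETA = 1.8, 0.8
for g in (0.0, 0.25, 0.5, 0.75):
    print(g, round(optimal_top_rate(g, ALPHA, ETA), 3))

# At the optimum the equimarginal condition 1 + MWC = 1/g should hold.
tau_opt = optimal_top_rate(0.5, ALPHA, ETA)
print(round(1.0 + marginal_welfare_cost(tau_opt, ALPHA, ETA), 6))
```

Setting g = 0 recovers the revenue-maximising rate, and the optimal rate falls towards zero as g approaches one.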
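Consider the optimal value for a lower marginal tax rate, τ_{L}, in the two-rate structure considered earlier (and which is easily extended to the multi-rate form). From above, this must satisfy:
$$1+\frac{\alpha_{L}\eta\frac{\tau_{L}}{1-\tau_{L}}}{1+D-\alpha_{L}\eta\frac{\tau_{L}}{1-\tau_{L}}} = \frac{1}{g}$$
which can be solved to give:
$$\tau_{L} = \left[1+\frac{\alpha_{L}\eta}{(1+D)(1-g)}\right]^{-1}$$
Another way to express this is:
$$\tau_{L} = \left[1+\frac{\eta}{1-g}\left\{\frac{N_{L}\bar{z}_{L}}{N_{H}(a_{H}-a_{L}) + N_{L}(\bar{z}_{L}-a_{L})}\right\}\right]^{-1}$$
The term N_{L}z̄_{L} is the income of those in the relevant tax bracket, while the term N_{H}(a_{H} − a_{L}) + N_{L}(z̄_{L} − a_{L}) measures the income to which the rate τ_{L} is applied. Hence the expression for the optimal rate corresponds precisely with that given in (47) for the optimal top marginal rate.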
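The lower-rate result can be sketched in the same way, assuming the optimal rate takes the form [1 + {η / (1 − g)}·ratio]^{−1}, where ratio is the income of those in the bracket, N_{L}z̄_{L}, divided by the income taxed at τ_{L}, N_{H}(a_{H} − a_{L}) + N_{L}(z̄_{L} − a_{L}). All numerical inputs below are hypothetical, as are the function and argument names.

```python
def income_ratio(n_low, mean_low, a_low, a_high, n_high):
    """Income of those in the bracket over income taxed at the bracket's rate."""
    taxed_at_rate = n_high * (a_high - a_low) + n_low * (mean_low - a_low)
    return n_low * mean_low / taxed_at_rate


def optimal_lower_rate(g, eta, n_low, mean_low, a_low, a_high, n_high):
    """Optimal tau_L = 1 / (1 + eta * ratio / (1 - g)) for constant g < 1."""
    if g >= 1.0:
        raise ValueError("this sketch assumes g = SMV/MVPF is below 1")
    ratio = income_ratio(n_low, mean_low, a_low, a_high, n_high)
    return 1.0 / (1.0 + eta * ratio / (1.0 - g))


# Hypothetical bracket: 1,000 people with mean income 30,000 between thresholds
# 14,000 and 48,000, and 400 people in higher brackets.
args = dict(n_low=1_000, mean_low=30_000.0, a_low=14_000.0,
            a_high=48_000.0, n_high=400)
print(round(optimal_lower_rate(0.5, 0.4, **args), 4))
print(round(optimal_lower_rate(0.0, 0.4, **args), 4))  # revenue-maximising case
```

With n_high set to zero the ratio collapses to z̄_{L} / (z̄_{L} − a_{L}), so the function reproduces the top-rate expression.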
Notes
 [11] For references to special cases where explicit solutions are available, and an approximation in the case of the linear income tax, see Creedy (2009).
 [12] Perhaps understandably, the report in Mirrlees (2011) often conflates the two approaches, suggesting that the use of reduced-form elasticities, allowing income adjustment in addition to labour supply incentive effects, is in the Mirrlees tradition. The common ground is of course a concept of an optimum, based on value judgements, an allowance for incentive effects, and the ability to express the optimum in terms of an equimarginal condition.
 [13] A more general approach in which the tax rate can vary continuously over the whole income range is discussed in Saez (2001) and in Brewer et al. (2010). The present approach is adopted for simplicity.
6.2 Comparison with Earlier Results
Instead of writing the optimal condition in terms of the marginal welfare cost, Saez (2001) expressed the condition for the optimal rate using the decomposition of marginal revenue into mechanical and behavioural terms, M and B respectively (where of course B is negative). Write (8) as MR = M + B and from (22), it is known that EV = M. Hence rearranging (42) as Mg = M + B gives the first-order condition as M(1 − g) + B = 0, which is the form given in Saez (2001, p. 210), who does not state (42) explicitly. For the revenue-maximising rate, MR = 0 and M = −B: Brewer et al. (2010, p. 102) thus refer to this rate as 'balancing mechanical and behavioural effects'.
When discussing optimal rates Brewer et al. (2010) write the condition, using current notation, as M + B − gM = 0. In their discussion, the value of MVPF is implicitly set at 1.[14] Hence the term −gM is effectively (−M)(SMV) and as EV = M this is the change in 'social welfare' resulting from a small tax rate change (that is, the left hand side of the optimal rate condition in (42)). In their own notation, Brewer et al. write mechanical and behavioural effects on revenue as dM and dB respectively, and they write the social welfare change, −gM, as dW. Thus their condition is written as dM + dB + dW = 0. In their discussion of appropriate settings for the value of g, they therefore consider only the variation in the welfare loss.
Notes
 [14] An allusion to this is later made in Brewer et al. (2010, p. 166, n. 75).
6.3 Imposing Value Judgements
The role of professional economists, following the famous statement by Robbins (1935), is to examine the implications of adopting alternative value judgements. In the present context this means examining the effects on optimal tax rates of alternative values of the ratio g = SMV / MVPF. The question therefore arises of how to interpret different orders of magnitude.
In the branch of optimal tax literature that follows the structural modelling approach of Mirrlees (1971), it is usual to consider the independent judge as selecting tax rates which maximise the value of a particular social welfare function, expressed in terms of individuals' utilities. The objective is thus entirely transparent and it is clear that interpersonal comparisons of utility are explicitly being made by the judge. Although this allows for a range of types of welfare function, the most common form to be examined is the additive, individualistic, Paretean, and Utilitarian form with constant relative inequality aversion, ε, so that W =
.[15] The exercise then becomes one of examining the effects of using different values of ε, and in interpreting orders of magnitude it is useful to consider the wellknown 'leaky bucket' experiment. [16] Within this framework, the optimal tax rate depends not only on the form of the social welfare function but also on the cardinalisation used for individuals' utility functions, although this is usually given less attention. [17]
The general structural approach can also deal with 'non-welfarist' social welfare functions, in which the judge does not evaluate outcomes in terms of things that matter directly to the individuals involved (such as their utility) but in terms of, for example, some aggregate poverty measure, or the number of non-workers.
In the present context of using the elasticity of taxable income in a reduced-form model, the choice of alternative values of g is less straightforward. Little guidance is given by Saez (2001), Brewer et al. (2010) and Mirrlees (2011). As mentioned earlier, Brewer et al. (2010) set the value of MVPF equal to one, and concentrate on discussing values of SMV. [18] The Mirrlees (2011) report gives most emphasis to the revenue-maximising rate in the top tax bracket, which (2011, p. 65) is 'equivalent to placing a zero value on their (marginal) welfare'. [19] Different judges may be concerned more explicitly with the question of how the tax revenue is spent: in the structural approach there is an explicit transfer payment and some non-transfer expenditure, the amount of which is considered to have previously been determined and which does not enter individuals' utilities. [20]
One approach might be to suppose that the judge envisages a tax and transfer system and applies an evaluation function of the form W = Σ_{k=1}^{K} z̄_{k}^{1−ε} / (1 − ε), that is, a weighted sum over the K tax brackets of average taxable income in each bracket. Consider the tax rate in the kth bracket, where transfers are assumed to go to those in the 1st bracket (and with no income effects there are no consequences for the behaviour of those in the 1st bracket). The (absolute) slope of the 'social indifference curve' relating z̄_{k} and z̄_{1} values for which social welfare is unchanged is thus (z̄_{1} / z̄_{k})^{ε}. For example, if the tax rate is being considered in a bracket for which the average income is twice that in the lowest tax bracket, and ε = 1, then g = 0.5. A lower value of ε = 0.5 gives g = 0.71 while a higher value of ε = 2 gives g = 0.25. Of course, a difficulty here is that the incomes are themselves endogenous and the link with utility is not straightforward.
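The arithmetic in this example can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `relative_weight` is hypothetical, and the code assumes nothing beyond the slope expression stated above, with incomes treated as given rather than endogenous.

```python
def relative_weight(mean_income_low, mean_income_k, epsilon):
    """Value of g implied by the slope (zbar_1 / zbar_k) ** epsilon."""
    return (mean_income_low / mean_income_k) ** epsilon


# Average income in bracket k is twice that in the lowest bracket.
for epsilon in (0.5, 1.0, 2.0):
    g = relative_weight(1.0, 2.0, epsilon)
    # 1 - g is the share of a dollar the judge would tolerate losing
    # when transferring from bracket k to the lowest bracket.
    print(f"epsilon = {epsilon}: g = {g:.2f}, tolerated leak = {1.0 - g:.2f}")
```

Only the ratio of the two average incomes matters, so any pair of incomes in the proportion 1:2 gives the same values of g.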
a_{L}   N_{L}     z_{L}     η     g = SMV / MVPF   τ_{opt}
0       807.04    6.948     0.2   0.996            0.126
14      1687.21   27.079    0.5   0.880            0.207
48      462.37    57.546    0.5   0.455            0.331
70      347.84    115.419   0.6   0.075            0.378
To illustrate the use of the results, Table 1 is based on the New Zealand taxable income distribution for 2010, with the 2009-10 income thresholds. As the exercise is purely illustrative, only income taxation is considered, and hence no account is taken of taxable welfare benefits or indirect taxes, or of their effect on overall effective tax rates. The first column gives the lower income threshold (in thousands of dollars) for each tax band, while columns headed N_{L} and z_{L} show, again in units of thousands, the number of individuals in each bracket and the arithmetic mean income respectively. The column headed η gives assumed values of the elasticity of taxable income within each bracket: these are hypothetical but are based on the results of Claus et al. (2012). Imposed values of g = SMV / MVPF and the implied optimal rates are shown in the final two columns of the table. The final column may be compared with the actual New Zealand rates of, respectively, 0.125, 0.21, 0.33 and 0.38. The values of g were obtained in each case by a trial and error search, such that the optimal rate matches the actual rate very closely.
 Figure 2: Variation in g with z

It can be seen that the values of g required for the optimal rate schedule to replicate the actual tax rates fall rapidly. These are plotted in Figure 2. The top marginal rate of 0.38 could be said to be consistent with a value judgement that places very little value on the marginal welfare loss of those in the top bracket: indeed, the revenue-maximising rate in this bracket is 0.396. [21] For the other tax brackets, the rates that maximise revenue are (from brackets one to three respectively) equal to 0.973, 0.685 and 0.476. These are in each case substantially higher than the actual rates. The revenue-maximising tax rate for the lowest tax bracket is of course very high because most people's incomes do not fall into that bracket, so there is very little effect on taxable income (in view of the assumed absence of income effects, whereby only the marginal rate matters).
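The entries in Table 1 and the revenue-maximising rates quoted above can be checked numerically. The sketch below assumes that the rate in bracket k satisfies τ / (1 − τ) = (1 − g) [N_{k}(z̄_{k} − a_{k}) + N_{k}^{+}(a_{k+1} − a_{k})] / (η N_{k} z̄_{k}), where N_{k}^{+} is the number of individuals in higher brackets. This is one reading of the optimal rate condition that is consistent with the tabulated figures, not a formula quoted from the text, and all function and variable names are illustrative.

```python
# Table 1 rows: (lower threshold a, number N, mean income zbar, elasticity eta, g).
# Thresholds and means are in thousands of dollars, numbers in thousands.
BRACKETS = [
    (0.0, 807.04, 6.948, 0.2, 0.996),
    (14.0, 1687.21, 27.079, 0.5, 0.880),
    (48.0, 462.37, 57.546, 0.5, 0.455),
    (70.0, 347.84, 115.419, 0.6, 0.075),
]


def bracket_rate(k, g):
    """Rate for bracket k under the ASSUMED condition described in the lead-in."""
    a, n, zbar, eta, _ = BRACKETS[k]
    # Mechanical base: income taxed at this rate for those inside the bracket...
    base = n * (zbar - a)
    if k + 1 < len(BRACKETS):
        # ...plus the full bracket width for everyone in higher brackets.
        width = BRACKETS[k + 1][0] - a
        base += width * sum(row[1] for row in BRACKETS[k + 1:])
    # Only those inside the bracket respond to its marginal rate.
    response = eta * n * zbar
    odds = (1.0 - g) * base / response  # equals tau / (1 - tau)
    return odds / (1.0 + odds)


optimal = [bracket_rate(k, row[4]) for k, row in enumerate(BRACKETS)]
revenue_max = [bracket_rate(k, 0.0) for k in range(len(BRACKETS))]  # g = 0

for k, (opt, rmax) in enumerate(zip(optimal, revenue_max), start=1):
    print(f"bracket {k}: optimal = {opt:.3f}, revenue-maximising = {rmax:.3f}")
```

Setting g = 0 removes the welfare term, so the same function returns the revenue-maximising rate; for the top bracket the expression collapses to 1 / (1 + ηα) with α = z̄ / (z̄ − a).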
Notes
 [15] The 'classical utilitarian' form, which was in fact the one considered in Mirrlees's original paper, is of course simply the sum of individuals' utilities (that is, inequality aversion is zero).
 [16] This involves considering taking $1 from one person and deciding how much one is prepared to lose (from the leaky bucket) in making a transfer to a poorer person. With this welfare function, and incomes of the rich and poor individuals as z_{R} and z_{P} respectively, a judge would tolerate a leak of 1 − (z_{P}/z_{R})^{ε} from the initial $1 taken from z_{R}.
 [17] See Creedy (1998b), where the use of money metric utility is explored; this is of course a particular cardinalisation which is invariant with respect to monotonic transformations of utility but does depend on the choice of 'reference prices'.
 [18] However, they persistently refer to society's or government's views about inequality. In view of well-known problems relating to the aggregation of preferences, the 'social welfare function' instead must be interpreted as representing the value judgements of a single independent person.
 [19] This led to Feldstein's (2012, p. 783) question, 'what kind of nation places no value on the welfare of those with income in the top bracket, treating them as the revenue producing property of the state?' and comment that 'many non-economists would find the Review's suggestion that a society could disregard the welfare of any group of taxpayers repugnant'. Here it is of course important to distinguish between the value attached to the total welfare of those in the top bracket and the value attached to a marginal reduction in welfare. It is the latter to which SMV, and hence g, applies.
 [20] However, some authors have investigated the implications of allowing public good expenditure to influence individuals' labour supply decisions.
 [21] Using the constant inequality aversion approach described above with ε = 2, g = 0.0036, which gives an optimal rate of 0.395. A parameter of 2 implies a very high value of inequality aversion: the judge would be prepared to take $1 from a person on average income in the top bracket, give less than half a cent to an average person in the lowest bracket and throw away the rest.
7 Welfare and Non-marginal Tax Changes
Instead of considering small changes, it is useful to be able to evaluate the welfare changes associated with a given tax rate compared with a no-tax situation or, more often, to evaluate the effect of a significant (non-marginal) change. In practice, many tax reforms cannot be considered to involve 'marginal' changes in tax rates. An example involves the introduction of a new top marginal rate in a multi-rate structure. In order to evaluate such changes, it is necessary to consider the precise form of the expenditure function in (19).
7.1 The Expenditure Function
As above, an individual's expenditure function, E(τ,U), is defined here as the minimum virtual income required to achieve a given level of utility, U, for a given tax rate, τ. To derive the expenditure function, first obtain indirect utility, V, as a function of μ and τ, by substituting the optimal values (5) and (2) into (1) to get:
As before, z_{0} represents income in the absence of taxation (that is, τ = 0). In this case it is therefore easy to solve for μ in terms of V and τ. Then replacing virtual income, μ, with E(τ,U) and V with U gives: [22]
Substituting into (19) gives, for an increase in τ from τ_{1} to τ_{2}:
and:
The change in revenue from a nonmarginal tax rate change is:
and:
Using μ = aτ, the term a(τ_{2} − τ_{1}) is equal to μ_{2} − μ_{1}.
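The role of the expenditure function can be illustrated numerically. The sketch below assumes the standard quasi-linear, constant-elasticity specification, under which taxable income is z = z_{0}(1 − τ)^{η} and the expenditure function is E(τ,U) = U − z_{0}(1 − τ)^{1+η} / (1 + η). These functional forms are an assumption made here for illustration, chosen because they are consistent with a constant elasticity of taxable income; the parameter values and function names are hypothetical.

```python
def taxable_income(tau, z0, eta):
    """Assumed constant-elasticity response: z = z0 * (1 - tau) ** eta."""
    return z0 * (1.0 - tau) ** eta


def expenditure(tau, utility, z0, eta):
    """Assumed expenditure function: virtual income needed to reach `utility`."""
    return utility - z0 * (1.0 - tau) ** (1.0 + eta) / (1.0 + eta)


# Hypothetical values, chosen only to illustrate the calculation.
z0, eta, tau, utility = 100.0, 0.5, 0.3, 80.0

# Central finite difference approximating the derivative of E with respect to tau.
step = 1e-6
slope = (expenditure(tau + step, utility, z0, eta)
         - expenditure(tau - step, utility, z0, eta)) / (2.0 * step)

print(f"dE/dtau (numerical) = {slope:.4f}")
print(f"taxable income z    = {taxable_income(tau, z0, eta):.4f}")
```

Under the assumed forms the numerical derivative of E with respect to τ coincides with taxable income, which is the Shephard's Lemma property.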
Notes
 [22] Using this result, Shephard's Lemma referred to above is easily confirmed, whereby ∂E / ∂τ = z.
7.2 The case where τ_{1} = 0
Consider first the case where τ_{1} = 0 and τ_{2} = τ. That is, consider the welfare change from the introduction of a tax, rather than a change in the tax rate. Then setting τ_{1} = 0 in (54):
and:
Hence the excess burden, EB, is:
Using z_{0} = z(1 − τ)^{−η} this becomes:
The excess burden per taxpayer (that is, for those N_{T} people with z > a) is thus obtained from (60) by replacing z with z̄_{T}. The tax revenue per taxpayer is τ(z̄_{T} − a). Hence the welfare cost per person, that is the excess burden per dollar of revenue, now denoted simply as WC, is:
As before, in the Pareto case α = z̄_{T} / (z̄_{T} − a) is constant.
 Figure 3: Welfare Cost of Taxation

An example of the variation in the welfare cost as the tax rate increases is shown in Figure 3, again for values of η_{z,1−τ} = 0.8 and α = 1.8 as in the earlier examples. Clearly, the total welfare cost of the tax per dollar of revenue continues to increase steadily beyond the point of maximum revenue.
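The shape of this relationship can be sketched in Python. The closed form used below, WC = α{[(1 − τ)^{−η} − (1 − τ)] / [(1 + η)τ] − 1}, is derived here under an assumed quasi-linear, constant-elasticity specification with z = z_{0}(1 − τ)^{η}; it is offered as an illustration and may differ in presentation from the expression in the text. The revenue-maximising rate is taken to be 1 / (1 + ηα), and the function name and grid of rates are hypothetical.

```python
def welfare_cost(tau, eta, alpha):
    """Excess burden per dollar of revenue under the ASSUMED specification."""
    ratio = ((1.0 - tau) ** (-eta) - (1.0 - tau)) / ((1.0 + eta) * tau)
    return alpha * (ratio - 1.0)


eta, alpha = 0.8, 1.8  # values used for Figure 3
tau_revenue_max = 1.0 / (1.0 + eta * alpha)  # assumed revenue-maximising rate

rates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
costs = [welfare_cost(t, eta, alpha) for t in rates]

print(f"revenue-maximising rate = {tau_revenue_max:.3f}")
for t, wc in zip(rates, costs):
    print(f"tau = {t:.1f}: welfare cost per dollar of revenue = {wc:.3f}")
```

For small rates the expression is approximately αητ / 2, the familiar triangle approximation, and it keeps rising on both sides of the revenue-maximising rate.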
7.3 An Increase from τ_{1} to τ_{2}
For a non-marginal change in the tax rate, (54) and (55) give, where ΔEB is written instead of MEB to indicate that discrete changes are considered:
The discrete change in the welfare cost, denoted ΔWC, is equal to the change in the total excess burden per dollar of extra revenue, τ_{2}(z̄_{2} − a) − τ_{1}(z̄_{1} − a). Hence in terms of the cost per person (replacing z values in (62) with corresponding averages):
Furthermore:
and writing α = z̄_{2} / (z̄_{2} − a), then a / z̄_{2} = 1 − 1/α. It can be seen that by letting τ_{1} = 0, this result reduces to (61).
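A discrete change of this kind can be illustrated for a single taxpayer. The sketch below again assumes the quasi-linear, constant-elasticity specification with z = z_{0}(1 − τ)^{η} and virtual income μ = aτ, so that the welfare loss from a rate τ is z_{0}[1 − (1 − τ)^{1+η}] / (1 + η) − aτ. These forms, the parameter values and the function names are assumptions made for illustration, not expressions taken from the text.

```python
def taxable_income(tau, z0, eta):
    """Assumed constant-elasticity response: z = z0 * (1 - tau) ** eta."""
    return z0 * (1.0 - tau) ** eta


def revenue(tau, z0, eta, a):
    """Tax paid on income above the threshold a."""
    return tau * (taxable_income(tau, z0, eta) - a)


def excess_burden(tau, z0, eta, a):
    """Welfare loss relative to no tax, net of the revenue collected."""
    welfare_loss = z0 * (1.0 - (1.0 - tau) ** (1.0 + eta)) / (1.0 + eta) - a * tau
    return welfare_loss - revenue(tau, z0, eta, a)


def delta_welfare_cost(tau1, tau2, z0, eta, a):
    """Change in excess burden per dollar of extra revenue."""
    d_eb = excess_burden(tau2, z0, eta, a) - excess_burden(tau1, z0, eta, a)
    d_rev = revenue(tau2, z0, eta, a) - revenue(tau1, z0, eta, a)
    return d_eb / d_rev


z0, eta, a = 100.0, 0.5, 30.0  # hypothetical values
print(f"rate raised from 0.3 to 0.4: {delta_welfare_cost(0.3, 0.4, z0, eta, a):.3f}")
print(f"tax introduced at 0.4:       {delta_welfare_cost(0.0, 0.4, z0, eta, a):.3f}")
```

With τ_{1} = 0 both the excess burden and the revenue start from zero, so the function returns the total welfare cost per dollar of revenue; in this example raising an existing rate is more costly per extra dollar than introducing the tax.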
8 Conclusions
The aim of this paper has been to provide a technical introduction to the use of the elasticity of taxable income in welfare comparisons and optimal tax discussions. This concept is now widely used in discussions of income tax policy, although a number of the results and assumptions are not entirely transparent in the literature. Using a consistent framework and notation, a number of established results concerning marginal welfare changes and optimal tax rates are derived. In addition, some new results relating to non-marginal tax changes, which are often relevant in practice, are presented.
It is particularly important to be able to consider the relevant value judgements used, so that the sources of policy disagreements can more easily be identified. Attention was given to the way value judgements enter into the calculation of optimal tax rates using the elasticity of taxable income measure, where they are somewhat less explicit than in the context of structural models which maximise a specified social welfare, or evaluation, function.
It was stated in the introduction that the use of the reduced-form concept of the elasticity of taxable income allows some strong results to be obtained in terms that are perhaps 'more concrete' than the results from structural models of optimal income taxation. However, it has also been seen that the results come at a cost of some strong simplifying assumptions. One point that perhaps needs stressing here is that the elasticity of taxable income, even within a model having a constant elasticity, is not in fact a fixed parameter but depends on many elements of the tax structure including, in particular, the ease of shifting income between sources. This leads to a distinction between optimal tax rates and optimal tax structures, where the latter includes things like the ease of becoming incorporated, establishing family trusts, and so on.
Both structural and reduced-form models are clearly highly simplified, both in terms of the economic environment and the behaviour of individuals. Neither approach can of course be expected to provide detailed practical policy advice. However, they can both be used, in their different ways, to illuminate and clarify different aspects of the complex relationships involved in choosing a tax rate structure.
References
Brewer, M., E. Saez, and A. Shephard (2010, Apr). Means testing and tax rates on earnings.
Claus, I., J. Creedy, and J. Teng (2012). The elasticity of taxable income in New Zealand. Fiscal Studies 33(3), 287-303.
Creedy, J. (1998a). Measuring Welfare Changes and Tax Burdens. Edward Elgar Pub.
Creedy, J. (1998b). The optimal linear income tax model: Utility or equivalent income? Scottish Journal of Political Economy 45(1), 99-110.
Creedy, J. (2008). Choosing the tax rate in a linear income tax structure. Australian Journal of Labour Economics (AJLE) 11(3), 257-276.
Creedy, J. (2009). Explicit solutions for the optimal linear income tax rate. Australian Economic Papers 48, 224-236.
Creedy, J. (2010). Elasticity of taxable income: an introduction and some basic analytics. Public Finance and Management 10, 556-589.
Creedy, J. and N. Gemmell (2013). Measuring revenue responses to tax rate changes in multi-rate income tax systems: Behavioural and structural factors. International Tax and Public Finance (forthcoming).
Feldstein, M. (1995). The effect of marginal tax rates on taxable income: A panel study of the 1986 tax reform act. Journal of Political Economy 103(3), 551-72.
Feldstein, M. (2012). The Mirrlees review. Journal of Economic Literature 50(3), 781-90.
Mirrlees, J. (1971). An exploration in the theory of optimal income taxation. Review of Economic Studies 38, 175-208.
Mirrlees, J. (2010). Dimensions of tax design. Oxford: Oxford University Press for the Institute for Fiscal Studies.
Mirrlees, J. (2011). Tax by design. Oxford: Oxford University Press for the Institute for Fiscal Studies.
Robbins, L. C. (1935). An essay on the nature and significance of economic science (2. ed., rev. and extended ed.). London: Macmillan.
Saez, E. (2001). Using elasticities to derive optimal income tax rates. Review of Economic Studies 68(1), 205-29.
Saez, E., J. Slemrod, and S. H. Giertz (2012). The elasticity of taxable income with respect to marginal tax rates: A critical review. Journal of Economic Literature 50(1), 3-50.
Tuomala, M. (1985). Simplified formulae for optimal linear income taxation. Scandinavian Journal of Economics 87, 668-672.