How to deal with a high-dimensional system? #40

Closed
Jiaxing-Wang opened this issue Mar 3, 2017 · 7 comments

@Jiaxing-Wang

I tried to construct a PCE using the spectral projection method like this:

import chaospy as cp

distribution = cp.Iid(cp.Normal(0, 1), 39)
nodes, weights = cp.generate_quadrature(order=1, domain=distribution)

My computer crashed immediately, as there are too many parameters. My question is: I want to construct a PCE with 39 uncertain parameters. Can I use the spectral projection method? Is there any way to reduce the number of samples the spectral projection method requires?

I have constructed a PCE using the linear regression method with 4000 samples like this (P are my samples and Y the corresponding model outputs):

import chaospy as cp

distribution = cp.Iid(cp.Uniform(0, 1), 39)
polynomial_expansion = cp.orth_ttr(2, distribution)
foo_approx = cp.fit_regression(polynomial_expansion, P, Y)

and the relative error is about 1%, which is acceptable for me.

I want to compare spectral projection and linear regression. Are they comparable for high-dimensional problems like mine, with 39 model inputs?

Thank you very much.

@tsilifis

tsilifis commented Mar 6, 2017

Hi Jiaxing,

Some general comments on your questions above:

  • Whether your computer crashes or not also depends on its memory, not only on the high dimensionality.
  • Make sure the quadrature rule for which you are generating points and weights is also sparse.
  • The general criterion for testing the accuracy of a PCE is its L2 error. When you don't know the output of the true model, you can estimate the relative L2 error instead; I think this is the only way you can compare the two methods with each other (see the sketch after this list). For linear regression you can also compute the R-squared statistic to check the goodness of fit, but that will not really tell you whether it is better than the spectral projection. On the other hand, linear regression lets you use fewer model evaluations, though keep in mind that the OLS solution does not take the orthogonality between the polynomials into account.
  • If the order of the PCE you want to construct is very low, you might also want to try estimating the coefficients with Monte Carlo samples.
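
To make the comparison concrete, here is a minimal sketch of estimating the relative L2 error on a held-out validation set; model, foo_projection and foo_regression are hypothetical stand-ins for the true model and the two fitted surrogates:

import numpy as np
import chaospy as cp

# `model`, `foo_projection` and `foo_regression` are hypothetical stand-ins
# for the true model and the two fitted PCE surrogates being compared.
distribution = cp.Iid(cp.Uniform(0, 1), 39)
validation = distribution.sample(1000)        # held-out samples, shape (39, 1000)
truth = np.array([model(sample) for sample in validation.T])

def relative_l2_error(surrogate):
    approx = surrogate(*validation)           # evaluate the PCE at every sample
    return np.linalg.norm(approx - truth) / np.linalg.norm(truth)

print(relative_l2_error(foo_projection))
print(relative_l2_error(foo_regression))

The surrogate with the smaller relative L2 error on the same validation set is the better fit, regardless of how its coefficients were estimated.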

Best,
Panos

@Jiaxing-Wang

Jiaxing-Wang commented Mar 6, 2017

@tsilifis, thank you for your answer. I now understand how to compare the different methods.

I have another question, about the sparse grid. I am trying to generate the sparse grid with code like this:

import chaospy as cp

dimension = 39
distribution = cp.Iid(cp.Uniform(0, 1), dimension)
polynomial_expansion = cp.orth_ttr(2, distribution)
nodes, weights = cp.generate_quadrature(order=2, domain=distribution, sparse=True, rule="c")

My computer now has 64 GB of memory. However, generating the grid with the code above is very slow, and I never get a result.

Am I doing something wrong?

@Jiaxing-Wang

Jiaxing-Wang commented Mar 6, 2017

@tsilifis, I cannot use a PCE of order larger than 2. This is again because the dimension of my problem is as large as 39: a third-order PCE in 39 dimensions already has C(39+3, 3) = 11480 unknown coefficients, compared with C(39+2, 2) = 820 at second order.

@jonathf

jonathf commented Mar 6, 2017

The library for pseudo-spectral projection was never designed to scale smoothly upwards, so the limitations you are experiencing are not much of a surprise. A non-sparse order-3 grid means 4^39 = 302231454903657293676544 samples. Sparse grids help by a few orders of magnitude, but that is still far too many samples to handle, even if the generator weren't struggling.
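
For scale, that count is just the four nodes of an order-3 rule tensorized over every dimension:

# An order-3 quadrature rule has 3 + 1 = 4 nodes per dimension, so a full
# tensor grid over 39 dimensions requires:
print((3 + 1) ** 39)   # 302231454903657293676544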

The method cp.fit_quadrature takes the arguments nodes, weights, evals, which only need to be numpy compliant. That is, if you have a quadrature rule that works well on a joint 39-dimensional i.i.d. Gaussian random variable, you are good to go. (If you are aware of any well-functioning high-dimensional quadrature schemes out there, let me know.)

A naive approach that will work is to use Monte Carlo integration: generate random samples from your distribution and use weights = 1./len(samples). Not nearly as accurate, but technically still pseudo-spectral projection.
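
A minimal sketch of that approach, assuming a hypothetical model function that maps one 39-dimensional sample to a scalar output:

import numpy as np
import chaospy as cp

distribution = cp.Iid(cp.Normal(0, 1), 39)
polynomial_expansion = cp.orth_ttr(2, distribution)

nodes = distribution.sample(4000)                    # random "quadrature" nodes
weights = np.ones(nodes.shape[1]) / nodes.shape[1]   # uniform Monte Carlo weights
evals = np.array([model(node) for node in nodes.T])  # `model` is hypothetical

foo_approx = cp.fit_quadrature(polynomial_expansion, nodes, weights, evals)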

Side note:
The rule="C" you used selects Clenshaw-Curtis quadrature, which works poorly on infinite-interval distributions like the Gaussian you are using. I would recommend something like the Gauss-Patterson rule, rule="P", instead.
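
For illustration, a minimal sketch of the same sparse-grid call with the Gauss-Patterson rule, with the dimension lowered to 5 (an assumption for the example) so the grid is small enough to actually generate:

import chaospy as cp

# Same call as above, but with rule="P" (Gauss-Patterson) and only 5 dimensions;
# a 39-dimensional sparse grid would still be far too large in practice.
distribution = cp.Iid(cp.Normal(0, 1), 5)
nodes, weights = cp.generate_quadrature(order=2, domain=distribution, sparse=True, rule="P")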

@Jiaxing-Wang

Jiaxing-Wang commented Mar 7, 2017

@jonathf, Thanks a lot.

I tried the method as you described. P represents the input samples, which were generated using a Sobol sequence, and Y represents the model outputs:

import numpy as np
import chaospy as cp

distribution = cp.Iid(cp.Uniform(0, 1), P.shape[0])
polynomial_expansion = cp.orth_ttr(2, distribution)
foo_approx = cp.fit_quadrature(polynomial_expansion, P, np.ones(len(Y)) / len(Y), Y)

The convergence of this method is much slower than that of the point collocation method.
[figure: convergence of pseudo-spectral projection vs. point collocation]

Can I conclude that point collocation is better than the pseudo-spectral projection method when the input dimension is very large?

@jonathf

jonathf commented Mar 7, 2017

Yes, you can safely conclude that PC is better than PSP in higher dimensions.

@Jiaxing-Wang

Jiaxing-Wang commented Mar 8, 2017

@jonathf Thank you very much.

I got my answer.
