Ordinary Least Squares Regression
I am quite surprised that a variant of linear regression has been proposed for a challenge, whereas estimation via ordinary least squares regression has not, despite the fact that this is arguably the most widely used method in applied economics, biology, psychology, and social sciences!
For details, check out the Wikipedia page on OLS. To keep things concise, suppose one has a model:
$$Y=\beta_0+\beta_1X_1+\beta_2X_2+\dots+\beta_kX_k+U$$
where all right-hand side X variables (regressors) are linearly independent and are assumed to be exogenous to (or at least uncorrelated with) the model error U. Then you solve the problem
$$\min_{\beta_0,\dots,\beta_k}\sum_{i=1}^n\left(Y_i-\beta_0-\beta_1X_{1,i}-\dots-\beta_kX_{k,i}\right)^2$$
on a sample of size n, given observations
$$\left(\begin{matrix}Y_1&X_{1,1}&\cdots&X_{k,1}\\\vdots&\vdots&\ddots&\vdots\\Y_n&X_{1,n}&\cdots&X_{k,n}\end{matrix}\right)$$
The OLS solution to this problem as a vector looks like
$$\hat{\beta}=(\textbf{X}'\textbf{X})^{-1}(\textbf{X}'\textbf{Y})$$
where Y is the first column of the input matrix and X is a matrix made of a column of ones followed by the remaining columns. This solution can be obtained via many numerical methods (matrix inversion, QR decomposition, Cholesky decomposition, etc.), so pick your favourite!
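As a rough illustration of the closed-form solution above, here is a minimal Python sketch that builds the normal equations X'X b = X'Y and solves them by Gauss-Jordan elimination with partial pivoting. The function name `ols_normal_equations` and the choice of input layout (a list of columns `[Y, X1, ..., Xk]`, as in the test cases) are assumptions made for this example; it is not a golfed answer and not the method of any answer below.

```python
def ols_normal_equations(columns):
    """Estimate OLS coefficients from a list of columns [Y, X1, ..., Xk].

    Hypothetical helper for illustration; assumes the regressors are not
    collinear, as the challenge allows.
    """
    y = [float(v) for v in columns[0]]
    n = len(y)
    # Design matrix stored column-wise: a column of ones, then the regressors.
    design = [[1.0] * n] + [[float(v) for v in col] for col in columns[1:]]
    p = len(design)

    # Augmented system [X'X | X'Y].
    aug = [
        [sum(a * b for a, b in zip(design[i], design[j])) for j in range(p)]
        + [sum(a * b for a, b in zip(design[i], y))]
        for i in range(p)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(p):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]

    # The system is now diagonal; read off the coefficients.
    return [aug[i][p] / aug[i][i] for i in range(p)]


print(ols_normal_equations([[4, 5, 6], [1, 2, 3]]))  # intercept, then slope
```

Forming X'X squares the condition number of the problem, so this approach is the simplest to write but the least robust for nearly collinear data.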
Of course, econometricians prefer slightly different notation, but let's just ignore them.
None other than Gauss himself is watching you from the skies, so do not disappoint one of the greatest mathematicians of all time and write the shortest code possible.
Task
Given the observations in a matrix form as shown above, estimate the coefficients of the linear regression model via OLS.
Input
A matrix of values. The first column is always `Y[1], ..., Y[n]`, the second column is `X1[1], ..., X1[n]`, the next one is `X2[1], ..., X2[n]`, etc. The column of ones is not given (as in real datasets), but you have to add it first in order to estimate `beta0` (as in real models).
NB. In statistics, regressing on a constant is widely used. This means that the model is `Y = b0 + U`, and the OLS estimate of `b0` is the sample average of `Y`. In this case, the input is just a matrix with one column, `Y`, and it is regressed on a column of ones.
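For readers who want the missing step, a short derivation in the notation used above: with only a constant, the least-squares objective has a single unknown, and setting its derivative to zero gives the sample mean.

$$\frac{d}{d\beta_0}\sum_{i=1}^n\left(Y_i-\beta_0\right)^2=-2\sum_{i=1}^n\left(Y_i-\beta_0\right)=0\quad\Longrightarrow\quad\hat{\beta}_0=\frac{1}{n}\sum_{i=1}^nY_i$$

The matrix formula gives the same result: when X is a single column of ones, X'X = n and X'Y is the sum of the Y values.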
You can safely assume that the variables are not exactly collinear, so the matrices above are invertible. In case your language cannot invert matrices with condition numbers larger than a certain threshold, state it explicitly, and provide a unique return value (that cannot be confused with output in case of success) denoting that the system seems to be computationally singular (`S` or any other unambiguous characters).
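One possible way to implement such a sentinel, sketched in Python under stated assumptions: orthogonalise the columns of X with modified Gram-Schmidt (a simple QR decomposition) and give up with `"S"` as soon as a column is numerically dependent on the previous ones. The name `ols_qr` and the relative threshold `tol=1e-10` are choices made for this illustration; a real answer should state whatever threshold its language actually applies.

```python
import math


def ols_qr(columns, tol=1e-10):
    """OLS via modified Gram-Schmidt QR on [Y, X1, ..., Xk].

    Returns the coefficient list, or the string "S" when a column of the
    design matrix looks linearly dependent on the earlier ones.
    `tol` is an illustrative relative threshold, not a standard value.
    """
    y = [float(v) for v in columns[0]]
    n = len(y)
    design = [[1.0] * n] + [[float(v) for v in col] for col in columns[1:]]
    p = len(design)

    q = []                             # orthonormal columns of Q
    r = [[0.0] * p for _ in range(p)]  # upper-triangular factor R
    for j, col in enumerate(design):
        v = col[:]
        for i in range(j):
            r[i][j] = sum(a * b for a, b in zip(q[i], v))
            v = [a - r[i][j] * b for a, b in zip(v, q[i])]
        r[j][j] = math.sqrt(sum(a * a for a in v))
        col_norm = math.sqrt(sum(a * a for a in col))
        if r[j][j] <= tol * col_norm:
            return "S"                 # computationally singular
        q.append([a / r[j][j] for a in v])

    # Solve R b = Q'Y by back substitution.
    qty = [sum(a * b for a, b in zip(q[i], y)) for i in range(p)]
    beta = [0.0] * p
    for i in reversed(range(p)):
        s = qty[i] - sum(r[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = s / r[i][i]
    return beta


print(ols_qr([[1, -2, 3, -4], [1, 2, 3, 4], [2, 4, 6, 8]]))  # X2 = 2 * X1
```

Working on X directly, rather than on X'X, avoids squaring the condition number, which is why QR-based solvers tolerate near-collinearity better than the normal equations.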
Output
OLS estimates of `beta_0, ..., beta_k` from a linear regression model in an unambiguous format or an indication that your language could not solve the system.
Challenge rules
- I/O formats are flexible. A matrix can be several lines of space-delimited numbers separated by newlines, or an array of row vectors, or an array of column vectors etc.
- This is code-golf, so shortest answer in bytes wins.
- Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer `lm` in R is not a valid answer because it expects a different kind of input.
Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters and return-type, full programs.
Default loopholes are forbidden.
Test cases
1. `[[4,5,6],[1,2,3]]` → output: `[3,1]`. Explanation: X = [[1,1,1], [1,2,3]], X'X = [[3,6],[6,14]], inverse(X'X) = [[2.333,-1],[-1,0.5]], X'Y = [15, 32], and inverse(X'X)⋅X'Y = [3, 1].
2. `[[5.5,4.1,10.5,7.7,6.6,7.2],[1.9,0.4,5.6,3.3,3.8,1.7],[4.2,2.2,3.2,3.2,2.5,6.6]]` → output: `[2.1171050,1.1351122,0.4539268]`.
3. `[[1,-2,3,-4],[1,2,3,4],[1,2,3,4.000001]]` → output: `[-1.3219977,6657598.3906250,-6657597.3945312]` or `S` (any code for a computationally singular system).
4. `[[1,2,3,4,5]]` → output: `3`.
Bonus points (to your karma, not to your byte count) if your code can solve very ill-conditioned (quasi-multicollinear) problems (e.g. if you throw in an extra zero in the decimal part of `X2[4]` in Test Case 3) with high precision.
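One way to approach the bonus, sketched here in Python as an illustration only, is to sidestep floating point entirely and solve the normal equations in exact rational arithmetic with the standard `fractions` module. The function name `ols_exact` is hypothetical, and the sketch assumes each input number is meant as its printed decimal value (so `4.000001` is read as exactly 4000001/1000000).

```python
from fractions import Fraction


def ols_exact(columns):
    """Exact OLS on [Y, X1, ..., Xk] using rational arithmetic.

    Each value is converted through its decimal string, which is an
    assumption about how the data were intended. Assumes the regressors
    are not exactly collinear, as the challenge guarantees.
    """
    y = [Fraction(str(v)) for v in columns[0]]
    n = len(y)
    design = [[Fraction(1)] * n] + [
        [Fraction(str(v)) for v in col] for col in columns[1:]
    ]
    p = len(design)

    # Augmented normal equations [X'X | X'Y], all entries exact.
    aug = [
        [sum(a * b for a, b in zip(design[i], design[j])) for j in range(p)]
        + [sum(a * b for a, b in zip(design[i], y))]
        for i in range(p)
    ]

    # Gauss-Jordan elimination; any nonzero pivot is fine in exact arithmetic.
    for col in range(p):
        pivot = next(r for r in range(col, p) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(p):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]

    return [aug[i][p] / aug[i][i] for i in range(p)]


beta = ols_exact([[1, -2, 3, -4], [1, 2, 3, 4], [1, 2, 3, 4.000001]])
print(beta)
print([float(b) for b in beta])
```

Under these assumptions, Test Case 3 evaluates to exactly -4/3, 20000003/3 and -20000000/3 (roughly -1.3333333, 6666667.6666667 and -6666666.6666667), which suggests that the floating-point figures quoted for that test case already carry rounding error caused by the near-collinearity. Adding further zeros to `X2[4]` changes nothing about the method, only the size of the fractions.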
For fun: you can implement estimation of standard errors (White's sandwich form or the simplified homoskedastic version) or any other kind of post-estimation diagnostics to amuse yourself if your language has built-in tools for that (I am looking at you, R, Python and Julia users). In other words, if your language lets you do cool stuff with regression objects in a few bytes, you can show it!
code-golf matrix statistics
edited Sep 24 at 9:03
asked Sep 22 at 10:11
Andreï Kostyrka
What about built-ins: does it have to be original solver code, or is using a built-in OK?
– Kirill L.
Sep 22 at 10:36
@KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer `lm` in R is not a valid answer because it expects a different kind of input.
– Andreï Kostyrka
Sep 22 at 10:45
Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? - takes input as data frame, and returns nothing with debug info on singular case
– Kirill L.
Sep 22 at 11:03
@KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone's Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: `as.data.frame(matrix(c(Y,X1,...),n))` where `n` indicates the number of observations.
– Andreï Kostyrka
Sep 22 at 11:08
@AndreïKostyrka Can you clarify your test cases: E.g. in the first example, is `[4,5,6]` a column or a row?
– flawr
Sep 22 at 13:38
3 Answers
R, 34 bytes
    function(x)try(lm(V1~.,x,si=F)$co)
Notes:
- Takes input as a data frame (which is what `lm` expects).
- `lm` would in principle work even without a formula with input of the indicated format, but the formula is needed for the constant regression case.
- `si=F`, short for `singular.ok=FALSE`, throws an error for the singular case, which is then caught by `try`, eventually returning nothing and printing the error as debug info. Otherwise, the regression would actually return some output, but not the expected one.
edited Sep 22 at 11:17
answered Sep 22 at 11:10
Kirill L.
MATL, 18 15 10 bytes
    llZ(GlZ)Y\
Basically works the same as my Octave program. Thanks for -5 bytes @LuisMendo! This improvement was achieved by writing the ones directly into the input instead of deleting a column and concatenating another one, as well as some reordering of the steps.
Explanation
    ll    push two ones for later use in the next line
    Z(    implicitly take the input and write ones into the first column
    Gl    push the input again, then push a one for use in the next line
    Z)    get the first column of the input matrix
    Y\    multiply this column by the pseudo-inverse of the matrix from the beginning
edited Sep 23 at 9:07
answered Sep 22 at 14:22
flawr
Octave, 41 33 32 bytes
The whole magic happens at the `\`: this operator left-multiplies the second argument by the inverse of the first. If there is no inverse, it uses a suitable pseudo-inverse.
`x(:,1)` extracts the first column $Y$ and `[x(:,1).^0,...]` adds a column of ones to the remaining part $X$, so that we also get the intercept `beta0`.
Thanks @LuisMendo for -1 byte!
    @(x)[(t=x(:,1)).^0,x(:,2:end)]\t
edited Sep 23 at 8:58
answered Sep 22 at 13:46
flawr
That `.^0` is a very clever way to get ones
– Luis Mendo
Sep 23 at 2:45
@LuisMendo My head was still in matlab mode, thanks a lot :D
– flawr
Sep 23 at 9:09