Ordinary Least Squares Regression

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
5
down vote

favorite












I am quite surprised that a variant of linear regression has been proposed for a challenge, whereas an estimation via ordinary least squares regression has not, despite the fact the this is arguably the most widely used method in applied economics, biology, psychology, and social sciences!



OLS visualisation



For details, check out the Wikipedia page on OLS. To keep things concise, suppose one has a model:



$$Y=beta_0+beta_1X_1+beta_2X_2+dots+beta_kX_k+U$$



where all right-hand side X variables—regressors—are linearly independent and are assumed to be exogenous (or at least uncorrelated) to the model error U. Then you solve the problem



$$min_beta_0,dots,beta_ksum_i=1^nleft(Y_i-beta_0-beta_1X_1-dots-beta_kX_kright)^2$$



on a sample of size n, given observations



$$left(beginmatrixY_1&X_1,1&cdots&X_k,1\vdots&vdots&ddots&vdots\Y_n&X_1,n&cdots&X_k,nendmatrixright)$$



The OLS solution to this problem as a vector looks like



$$hatbeta=(textbfX'textbfX)^-1(textbfX'textbfY)$$



where Y is the first column of the input matrix and X is a matrix made of a column of ones and remaining columns. This solution can be obtained via many numerical methods (matrix inversion, QR decomposition, Cholesky decomposition etc.), so pick your favourite!



Of course, econometricians prefer slightly different notation, but let’s just ignore them.



Non other than Gauss himself is watching you from the skies, so do not disappoint one of the greatest mathematicians of all times and write the shortest code possible.



Task



Given the observations in a matrix form as shown above, estimate the coefficients of the linear regression model via OLS.



Input



A matrix of values. The first column is always Y[1], ..., Y[n], the second column is X1[1], ..., X1[n], the next one is X2[1], ..., X2[n] etc. The column of ones is not given (as in real datasets), but you have to add it first in order to estimate beta0 (as in real models).



NB. In statistics, regressing on a constant is widely used. This means that the model is Y = b0 + U, and the OLS estimate of b0 is the sample average of Y. In this case, the input is just a matrix with one column, Y, and it is regressed on a column of ones.



You can safely assume that the variables are not exactly collinear, so the matrices above are invertible. In case your language cannot invert matrices with condition numbers larger than a certain threshold, state it explicitly, and provide a unique return value (that cannot be confused with output in case of success) denoting that the system seems to be computationally singular (S or any other unambiguous characters).



Output



OLS estimates of beta_0, ..., beta_k from a linear regression model in an unambiguous format or an indication that your language could not solve the system.



Challenge rules



  • I/O formats are flexible. A matrix can be several lines of space-delimited numbers separated by newlines, or an array of row vectors, or an array of column vectors etc.

  • This is code-golf, so shortest answer in bytes wins.

  • Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.


  • Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters and return-type, full programs.


  • Default loopholes are forbidden.

Test cases




  1. [[4,5,6],[1,2,3]] → output: [3,1]. Explanation: X = [[1,1,1], [1,2,3]], X'X = [[3,6],[6,14]], inverse(X'X) = [[2.333,-1],[-1,0.5]], X'Y = [15, 32], and inverse(X'X)⋅X'Y=[3, 1]


  2. [[5.5,4.1,10.5,7.7,6.6,7.2],[1.9,0.4,5.6,3.3,3.8,1.7],[4.2,2.2,3.2,3.2,2.5,6.6]] → output: [2.1171050,1.1351122,0.4539268].


  3. [[1,-2,3,-4],[1,2,3,4],[1,2,3,4.000001]] → output: [-1.3219977,6657598.3906250,-6657597.3945312] or S (any code for a computationally singular system).


  4. [[1,2,3,4,5]] → output: 3.

Bonus points (to your karma, not to your byte count) if your code can solve very ill-conditioned (quasi-multicollinear) problems (e. g. if you throw in an extra zero in the decimal part of X2[4] in Test Case 3) with high precision.



For fun: you can implement estimation of standard errors (White’s sandwich form or the simplified homoskedastic version) or any other kind of post-estimation diagnostics to amuse yourself if your language has built-in tools for that (I am looking at you, R, Python and Julia users). In other words, if your language allows to do cool stuff with regression objects in few bytes, you can show it!










share|improve this question























  • What about built-ins, does it have to be an original solver code, or using a built-in is OK?
    – Kirill L.
    Sep 22 at 10:36










  • @KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
    – Andreï Kostyrka
    Sep 22 at 10:45










  • Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? - takes input as data frame, and returns nothing with debug info on singular case
    – Kirill L.
    Sep 22 at 11:03






  • 1




    @KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone’s Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: as.data.frame(matrix(c(Y,X1,...),n)) where n indicates the number of observations.
    – Andreï Kostyrka
    Sep 22 at 11:08











  • @AndreïKostyrka Can you clarify your test cases: E.g. in the first example, is [4,5,6] a column or a row?
    – flawr
    Sep 22 at 13:38














up vote
5
down vote

favorite












I am quite surprised that a variant of linear regression has been proposed for a challenge, whereas an estimation via ordinary least squares regression has not, despite the fact the this is arguably the most widely used method in applied economics, biology, psychology, and social sciences!



OLS visualisation



For details, check out the Wikipedia page on OLS. To keep things concise, suppose one has a model:



$$Y=beta_0+beta_1X_1+beta_2X_2+dots+beta_kX_k+U$$



where all right-hand side X variables—regressors—are linearly independent and are assumed to be exogenous (or at least uncorrelated) to the model error U. Then you solve the problem



$$min_beta_0,dots,beta_ksum_i=1^nleft(Y_i-beta_0-beta_1X_1-dots-beta_kX_kright)^2$$



on a sample of size n, given observations



$$left(beginmatrixY_1&X_1,1&cdots&X_k,1\vdots&vdots&ddots&vdots\Y_n&X_1,n&cdots&X_k,nendmatrixright)$$



The OLS solution to this problem as a vector looks like



$$hatbeta=(textbfX'textbfX)^-1(textbfX'textbfY)$$



where Y is the first column of the input matrix and X is a matrix made of a column of ones and remaining columns. This solution can be obtained via many numerical methods (matrix inversion, QR decomposition, Cholesky decomposition etc.), so pick your favourite!



Of course, econometricians prefer slightly different notation, but let’s just ignore them.



Non other than Gauss himself is watching you from the skies, so do not disappoint one of the greatest mathematicians of all times and write the shortest code possible.



Task



Given the observations in a matrix form as shown above, estimate the coefficients of the linear regression model via OLS.



Input



A matrix of values. The first column is always Y[1], ..., Y[n], the second column is X1[1], ..., X1[n], the next one is X2[1], ..., X2[n] etc. The column of ones is not given (as in real datasets), but you have to add it first in order to estimate beta0 (as in real models).



NB. In statistics, regressing on a constant is widely used. This means that the model is Y = b0 + U, and the OLS estimate of b0 is the sample average of Y. In this case, the input is just a matrix with one column, Y, and it is regressed on a column of ones.



You can safely assume that the variables are not exactly collinear, so the matrices above are invertible. In case your language cannot invert matrices with condition numbers larger than a certain threshold, state it explicitly, and provide a unique return value (that cannot be confused with output in case of success) denoting that the system seems to be computationally singular (S or any other unambiguous characters).



Output



OLS estimates of beta_0, ..., beta_k from a linear regression model in an unambiguous format or an indication that your language could not solve the system.



Challenge rules



  • I/O formats are flexible. A matrix can be several lines of space-delimited numbers separated by newlines, or an array of row vectors, or an array of column vectors etc.

  • This is code-golf, so shortest answer in bytes wins.

  • Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.


  • Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters and return-type, full programs.


  • Default loopholes are forbidden.

Test cases




  1. [[4,5,6],[1,2,3]] → output: [3,1]. Explanation: X = [[1,1,1], [1,2,3]], X'X = [[3,6],[6,14]], inverse(X'X) = [[2.333,-1],[-1,0.5]], X'Y = [15, 32], and inverse(X'X)⋅X'Y=[3, 1]


  2. [[5.5,4.1,10.5,7.7,6.6,7.2],[1.9,0.4,5.6,3.3,3.8,1.7],[4.2,2.2,3.2,3.2,2.5,6.6]] → output: [2.1171050,1.1351122,0.4539268].


  3. [[1,-2,3,-4],[1,2,3,4],[1,2,3,4.000001]] → output: [-1.3219977,6657598.3906250,-6657597.3945312] or S (any code for a computationally singular system).


  4. [[1,2,3,4,5]] → output: 3.

Bonus points (to your karma, not to your byte count) if your code can solve very ill-conditioned (quasi-multicollinear) problems (e. g. if you throw in an extra zero in the decimal part of X2[4] in Test Case 3) with high precision.



For fun: you can implement estimation of standard errors (White’s sandwich form or the simplified homoskedastic version) or any other kind of post-estimation diagnostics to amuse yourself if your language has built-in tools for that (I am looking at you, R, Python and Julia users). In other words, if your language allows to do cool stuff with regression objects in few bytes, you can show it!










share|improve this question























  • What about built-ins, does it have to be an original solver code, or using a built-in is OK?
    – Kirill L.
    Sep 22 at 10:36










  • @KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
    – Andreï Kostyrka
    Sep 22 at 10:45










  • Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? - takes input as data frame, and returns nothing with debug info on singular case
    – Kirill L.
    Sep 22 at 11:03






  • 1




    @KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone’s Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: as.data.frame(matrix(c(Y,X1,...),n)) where n indicates the number of observations.
    – Andreï Kostyrka
    Sep 22 at 11:08











  • @AndreïKostyrka Can you clarify your test cases: E.g. in the first example, is [4,5,6] a column or a row?
    – flawr
    Sep 22 at 13:38












up vote
5
down vote

favorite









up vote
5
down vote

favorite











I am quite surprised that a variant of linear regression has been proposed for a challenge, whereas an estimation via ordinary least squares regression has not, despite the fact the this is arguably the most widely used method in applied economics, biology, psychology, and social sciences!



OLS visualisation



For details, check out the Wikipedia page on OLS. To keep things concise, suppose one has a model:



$$Y=beta_0+beta_1X_1+beta_2X_2+dots+beta_kX_k+U$$



where all right-hand side X variables—regressors—are linearly independent and are assumed to be exogenous (or at least uncorrelated) to the model error U. Then you solve the problem



$$min_beta_0,dots,beta_ksum_i=1^nleft(Y_i-beta_0-beta_1X_1-dots-beta_kX_kright)^2$$



on a sample of size n, given observations



$$left(beginmatrixY_1&X_1,1&cdots&X_k,1\vdots&vdots&ddots&vdots\Y_n&X_1,n&cdots&X_k,nendmatrixright)$$



The OLS solution to this problem as a vector looks like



$$hatbeta=(textbfX'textbfX)^-1(textbfX'textbfY)$$



where Y is the first column of the input matrix and X is a matrix made of a column of ones and remaining columns. This solution can be obtained via many numerical methods (matrix inversion, QR decomposition, Cholesky decomposition etc.), so pick your favourite!



Of course, econometricians prefer slightly different notation, but let’s just ignore them.



Non other than Gauss himself is watching you from the skies, so do not disappoint one of the greatest mathematicians of all times and write the shortest code possible.



Task



Given the observations in a matrix form as shown above, estimate the coefficients of the linear regression model via OLS.



Input



A matrix of values. The first column is always Y[1], ..., Y[n], the second column is X1[1], ..., X1[n], the next one is X2[1], ..., X2[n] etc. The column of ones is not given (as in real datasets), but you have to add it first in order to estimate beta0 (as in real models).



NB. In statistics, regressing on a constant is widely used. This means that the model is Y = b0 + U, and the OLS estimate of b0 is the sample average of Y. In this case, the input is just a matrix with one column, Y, and it is regressed on a column of ones.



You can safely assume that the variables are not exactly collinear, so the matrices above are invertible. In case your language cannot invert matrices with condition numbers larger than a certain threshold, state it explicitly, and provide a unique return value (that cannot be confused with output in case of success) denoting that the system seems to be computationally singular (S or any other unambiguous characters).



Output



OLS estimates of beta_0, ..., beta_k from a linear regression model in an unambiguous format or an indication that your language could not solve the system.



Challenge rules



  • I/O formats are flexible. A matrix can be several lines of space-delimited numbers separated by newlines, or an array of row vectors, or an array of column vectors etc.

  • This is code-golf, so shortest answer in bytes wins.

  • Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.


  • Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters and return-type, full programs.


  • Default loopholes are forbidden.

Test cases




  1. [[4,5,6],[1,2,3]] → output: [3,1]. Explanation: X = [[1,1,1], [1,2,3]], X'X = [[3,6],[6,14]], inverse(X'X) = [[2.333,-1],[-1,0.5]], X'Y = [15, 32], and inverse(X'X)⋅X'Y=[3, 1]


  2. [[5.5,4.1,10.5,7.7,6.6,7.2],[1.9,0.4,5.6,3.3,3.8,1.7],[4.2,2.2,3.2,3.2,2.5,6.6]] → output: [2.1171050,1.1351122,0.4539268].


  3. [[1,-2,3,-4],[1,2,3,4],[1,2,3,4.000001]] → output: [-1.3219977,6657598.3906250,-6657597.3945312] or S (any code for a computationally singular system).


  4. [[1,2,3,4,5]] → output: 3.

Bonus points (to your karma, not to your byte count) if your code can solve very ill-conditioned (quasi-multicollinear) problems (e. g. if you throw in an extra zero in the decimal part of X2[4] in Test Case 3) with high precision.



For fun: you can implement estimation of standard errors (White’s sandwich form or the simplified homoskedastic version) or any other kind of post-estimation diagnostics to amuse yourself if your language has built-in tools for that (I am looking at you, R, Python and Julia users). In other words, if your language allows to do cool stuff with regression objects in few bytes, you can show it!










share|improve this question















I am quite surprised that a variant of linear regression has been proposed for a challenge, whereas an estimation via ordinary least squares regression has not, despite the fact the this is arguably the most widely used method in applied economics, biology, psychology, and social sciences!



OLS visualisation



For details, check out the Wikipedia page on OLS. To keep things concise, suppose one has a model:



$$Y=beta_0+beta_1X_1+beta_2X_2+dots+beta_kX_k+U$$



where all right-hand side X variables—regressors—are linearly independent and are assumed to be exogenous (or at least uncorrelated) to the model error U. Then you solve the problem



$$min_beta_0,dots,beta_ksum_i=1^nleft(Y_i-beta_0-beta_1X_1-dots-beta_kX_kright)^2$$



on a sample of size n, given observations



$$left(beginmatrixY_1&X_1,1&cdots&X_k,1\vdots&vdots&ddots&vdots\Y_n&X_1,n&cdots&X_k,nendmatrixright)$$



The OLS solution to this problem as a vector looks like



$$hatbeta=(textbfX'textbfX)^-1(textbfX'textbfY)$$



where Y is the first column of the input matrix and X is a matrix made of a column of ones and remaining columns. This solution can be obtained via many numerical methods (matrix inversion, QR decomposition, Cholesky decomposition etc.), so pick your favourite!



Of course, econometricians prefer slightly different notation, but let’s just ignore them.



Non other than Gauss himself is watching you from the skies, so do not disappoint one of the greatest mathematicians of all times and write the shortest code possible.



Task



Given the observations in a matrix form as shown above, estimate the coefficients of the linear regression model via OLS.



Input



A matrix of values. The first column is always Y[1], ..., Y[n], the second column is X1[1], ..., X1[n], the next one is X2[1], ..., X2[n] etc. The column of ones is not given (as in real datasets), but you have to add it first in order to estimate beta0 (as in real models).



NB. In statistics, regressing on a constant is widely used. This means that the model is Y = b0 + U, and the OLS estimate of b0 is the sample average of Y. In this case, the input is just a matrix with one column, Y, and it is regressed on a column of ones.



You can safely assume that the variables are not exactly collinear, so the matrices above are invertible. In case your language cannot invert matrices with condition numbers larger than a certain threshold, state it explicitly, and provide a unique return value (that cannot be confused with output in case of success) denoting that the system seems to be computationally singular (S or any other unambiguous characters).



Output



OLS estimates of beta_0, ..., beta_k from a linear regression model in an unambiguous format or an indication that your language could not solve the system.



Challenge rules



  • I/O formats are flexible. A matrix can be several lines of space-delimited numbers separated by newlines, or an array of row vectors, or an array of column vectors etc.

  • This is code-golf, so shortest answer in bytes wins.

  • Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.


  • Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters and return-type, full programs.


  • Default loopholes are forbidden.

Test cases




  1. [[4,5,6],[1,2,3]] → output: [3,1]. Explanation: X = [[1,1,1], [1,2,3]], X'X = [[3,6],[6,14]], inverse(X'X) = [[2.333,-1],[-1,0.5]], X'Y = [15, 32], and inverse(X'X)⋅X'Y=[3, 1]


  2. [[5.5,4.1,10.5,7.7,6.6,7.2],[1.9,0.4,5.6,3.3,3.8,1.7],[4.2,2.2,3.2,3.2,2.5,6.6]] → output: [2.1171050,1.1351122,0.4539268].


  3. [[1,-2,3,-4],[1,2,3,4],[1,2,3,4.000001]] → output: [-1.3219977,6657598.3906250,-6657597.3945312] or S (any code for a computationally singular system).


  4. [[1,2,3,4,5]] → output: 3.

Bonus points (to your karma, not to your byte count) if your code can solve very ill-conditioned (quasi-multicollinear) problems (e. g. if you throw in an extra zero in the decimal part of X2[4] in Test Case 3) with high precision.



For fun: you can implement estimation of standard errors (White’s sandwich form or the simplified homoskedastic version) or any other kind of post-estimation diagnostics to amuse yourself if your language has built-in tools for that (I am looking at you, R, Python and Julia users). In other words, if your language allows to do cool stuff with regression objects in few bytes, you can show it!







code-golf matrix statistics






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Sep 24 at 9:03

























asked Sep 22 at 10:11









Andreï Kostyrka

1,179617




1,179617











  • What about built-ins, does it have to be an original solver code, or using a built-in is OK?
    – Kirill L.
    Sep 22 at 10:36










  • @KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
    – Andreï Kostyrka
    Sep 22 at 10:45










  • Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? - takes input as data frame, and returns nothing with debug info on singular case
    – Kirill L.
    Sep 22 at 11:03






  • 1




    @KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone’s Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: as.data.frame(matrix(c(Y,X1,...),n)) where n indicates the number of observations.
    – Andreï Kostyrka
    Sep 22 at 11:08











  • @AndreïKostyrka Can you clarify your test cases: E.g. in the first example, is [4,5,6] a column or a row?
    – flawr
    Sep 22 at 13:38
















  • What about built-ins, does it have to be an original solver code, or using a built-in is OK?
    – Kirill L.
    Sep 22 at 10:36










  • @KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
    – Andreï Kostyrka
    Sep 22 at 10:45










  • Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? - takes input as data frame, and returns nothing with debug info on singular case
    – Kirill L.
    Sep 22 at 11:03






  • 1




    @KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone’s Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: as.data.frame(matrix(c(Y,X1,...),n)) where n indicates the number of observations.
    – Andreï Kostyrka
    Sep 22 at 11:08











  • @AndreïKostyrka Can you clarify your test cases: E.g. in the first example, is [4,5,6] a column or a row?
    – flawr
    Sep 22 at 13:38















What about built-ins, does it have to be an original solver code, or using a built-in is OK?
– Kirill L.
Sep 22 at 10:36




What about built-ins, does it have to be an original solver code, or using a built-in is OK?
– Kirill L.
Sep 22 at 10:36












@KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
– Andreï Kostyrka
Sep 22 at 10:45




@KirillL. Added this rule: Built-in functions are allowed as long as they are tweaked to produce a solution given the input matrix as an argument. That is, the two-byte answer lm in R is not a valid answer because it expects a different kind of input.
– Andreï Kostyrka
Sep 22 at 10:45












Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? - takes input as data frame, and returns nothing with debug info on singular case
– Kirill L.
Sep 22 at 11:03




Well, I guessed that :) Still a bit unsure about I/O though. Is this valid? - takes input as data frame, and returns nothing with debug info on singular case
– Kirill L.
Sep 22 at 11:03




1




1




@KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone’s Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: as.data.frame(matrix(c(Y,X1,...),n)) where n indicates the number of observations.
– Andreï Kostyrka
Sep 22 at 11:08





@KirillL. Yes, of course! Post it! (Optionally: and watch your clever answer being beaten by someone’s Pyth or MATL string of seemingly random characters.) Just specify that the input format is this: as.data.frame(matrix(c(Y,X1,...),n)) where n indicates the number of observations.
– Andreï Kostyrka
Sep 22 at 11:08













@AndreïKostyrka Can you clarify your test cases: E.g. in the first example, is [4,5,6] a column or a row?
– flawr
Sep 22 at 13:38




@AndreïKostyrka Can you clarify your test cases: E.g. in the first example, is [4,5,6] a column or a row?
– flawr
Sep 22 at 13:38










3 Answers
3






active

oldest

votes

















up vote
10
down vote














R, 34 bytes





function(x)try(lm(V1~.,x,si=F)$co)


Try it online!



Notes:



  • Takes input as a data frame (which is what lm expects)


  • lm would in principle work even without formula with input of indicated format, but it is needed for the constant regression case


  • si=F, short for singular.ok=FALSE
    throws an error for singular case, which is then caught by try, eventually returning nothing and printing the error as debug info.
    Otherwise, the regression would actually return some output, but not the
    expected one.





share|improve this answer





























    up vote
    5
    down vote














    MATL, 18 15 10 bytes



    llZ(GlZ)Y


    Basically works the same as my Octave program. Thanks for -5 bytes @LuisMendo! This improvement was achieved by inputting the ones directly into the input instead of deleting a column and concatenating another one as well as some reordering of the steps.



    Explanation



    ll push two ones for later use in the next line
    Z( implicitly take the input and plug in all ONES in to the FIRST column
    Gl push input again, push a one for use in the next line
    Z) get the FIRST column of the input matrix
    Y multiply this column with the pseudo inverse of the matrix from the beginnign


    Try it online!






    share|improve this answer





























      up vote
      3
      down vote














      Octave, 41 33 32 bytes



      The whole magic happens at the : This operator left multiplies the second argument with the inverse of the first. If there is no inverse, it uses a suitable pseudo inverse. x(:,1) extracts the first column $ Y $ and [x(:,1).^0,...] adds a column of ones to the remaining part $ X $ such that we also get the intercept beta0.



      Thanks @LuisMendo for -1 byte!





      @(x)[(t=x(:,1)).^0,x(:,2:end)]t


      Try it online!






      share|improve this answer






















      • That .^0 is a very clever way to get ones
        – Luis Mendo
        Sep 23 at 2:45











      • @LuisMendo My head was still in matlab mode, thanks a lot :D
        – flawr
        Sep 23 at 9:09










      Your Answer




      StackExchange.ifUsing("editor", function ()
      return StackExchange.using("mathjaxEditing", function ()
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
      );
      );
      , "mathjax-editing");

      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "200"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      convertImagesToLinks: false,
      noModals: false,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













       

      draft saved


      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodegolf.stackexchange.com%2fquestions%2f172643%2fordinary-least-squares-regression%23new-answer', 'question_page');

      );

      Post as a guest






























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      10
      down vote














      R, 34 bytes





      function(x)try(lm(V1~.,x,si=F)$co)


      Try it online!



      Notes:



      • Takes input as a data frame (which is what lm expects)


      • lm would in principle work even without formula with input of indicated format, but it is needed for the constant regression case


      • si=F, short for singular.ok=FALSE
        throws an error for singular case, which is then caught by try, eventually returning nothing and printing the error as debug info.
        Otherwise, the regression would actually return some output, but not the
        expected one.





      share|improve this answer


























        up vote
        10
        down vote














        R, 34 bytes





        function(x)try(lm(V1~.,x,si=F)$co)


        Try it online!



        Notes:



        • Takes input as a data frame (which is what lm expects)


        • lm would in principle work even without formula with input of indicated format, but it is needed for the constant regression case


        • si=F, short for singular.ok=FALSE
          throws an error for singular case, which is then caught by try, eventually returning nothing and printing the error as debug info.
          Otherwise, the regression would actually return some output, but not the
          expected one.





        share|improve this answer
























          up vote
          10
          down vote










          up vote
          10
          down vote










          R, 34 bytes





          function(x)try(lm(V1~.,x,si=F)$co)


          Try it online!



          Notes:



          • Takes input as a data frame (which is what lm expects)


          • lm would in principle work even without formula with input of indicated format, but it is needed for the constant regression case


          • si=F, short for singular.ok=FALSE
            throws an error for singular case, which is then caught by try, eventually returning nothing and printing the error as debug info.
            Otherwise, the regression would actually return some output, but not the
            expected one.





          share|improve this answer















          R, 34 bytes





          function(x)try(lm(V1~.,x,si=F)$co)


          Try it online!



          Notes:



          • Takes input as a data frame (which is what lm expects)


          • lm would in principle work even without formula with input of indicated format, but it is needed for the constant regression case


          • si=F, short for singular.ok=FALSE
            throws an error for singular case, which is then caught by try, eventually returning nothing and printing the error as debug info.
            Otherwise, the regression would actually return some output, but not the
            expected one.






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Sep 22 at 11:17

























          answered Sep 22 at 11:10









          Kirill L.

          2,7361117




          2,7361117




















              up vote
              5
              down vote














              MATL, 18 15 10 bytes



              llZ(GlZ)Y


              Basically works the same as my Octave program. Thanks for -5 bytes @LuisMendo! This improvement was achieved by inputting the ones directly into the input instead of deleting a column and concatenating another one as well as some reordering of the steps.



              Explanation



              ll push two ones for later use in the next line
              Z( implicitly take the input and plug in all ONES in to the FIRST column
              Gl push input again, push a one for use in the next line
              Z) get the FIRST column of the input matrix
              Y multiply this column with the pseudo inverse of the matrix from the beginnign


              Try it online!






              share|improve this answer


























                up vote
                5
                down vote














                MATL, 18 15 10 bytes



                llZ(GlZ)Y


                Basically works the same as my Octave program. Thanks for -5 bytes @LuisMendo! This improvement was achieved by inputting the ones directly into the input instead of deleting a column and concatenating another one as well as some reordering of the steps.



                Explanation



                ll push two ones for later use in the next line
                Z( implicitly take the input and plug in all ONES in to the FIRST column
                Gl push input again, push a one for use in the next line
                Z) get the FIRST column of the input matrix
                Y multiply this column with the pseudo inverse of the matrix from the beginnign


                Try it online!






                share|improve this answer
























                  up vote
                  5
                  down vote










                  up vote
                  5
                  down vote










                  MATL, 18 15 10 bytes



                  llZ(GlZ)Y


                  Basically works the same as my Octave program. Thanks for -5 bytes @LuisMendo! This improvement was achieved by inputting the ones directly into the input instead of deleting a column and concatenating another one as well as some reordering of the steps.



                  Explanation



                  ll push two ones for later use in the next line
                  Z( implicitly take the input and plug in all ONES in to the FIRST column
                  Gl push input again, push a one for use in the next line
                  Z) get the FIRST column of the input matrix
                  Y multiply this column with the pseudo inverse of the matrix from the beginnign


                  Try it online!






                  share|improve this answer















                  MATL, 18 15 10 bytes



                  llZ(GlZ)Y


                  Basically works the same as my Octave program. Thanks for -5 bytes @LuisMendo! This improvement was achieved by inputting the ones directly into the input instead of deleting a column and concatenating another one as well as some reordering of the steps.



                  Explanation



                  ll push two ones for later use in the next line
                  Z( implicitly take the input and plug in all ONES in to the FIRST column
                  Gl push input again, push a one for use in the next line
                  Z) get the FIRST column of the input matrix
                  Y multiply this column with the pseudo inverse of the matrix from the beginnign


                  Try it online!







                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Sep 23 at 9:07

























                  answered Sep 22 at 14:22









                  flawr

                  25.7k562177




                  25.7k562177




















                      up vote
                      3
                      down vote














                      Octave, 41 33 32 bytes



                      The whole magic happens at the : This operator left multiplies the second argument with the inverse of the first. If there is no inverse, it uses a suitable pseudo inverse. x(:,1) extracts the first column $ Y $ and [x(:,1).^0,...] adds a column of ones to the remaining part $ X $ such that we also get the intercept beta0.



                      Thanks @LuisMendo for -1 byte!





                      @(x)[(t=x(:,1)).^0,x(:,2:end)]t


                      Try it online!






                      share|improve this answer






















                      • That .^0 is a very clever way to get ones
                        – Luis Mendo
                        Sep 23 at 2:45











                      • @LuisMendo My head was still in matlab mode, thanks a lot :D
                        – flawr
                        Sep 23 at 9:09














                      up vote
                      3
                      down vote














                      Octave, 41 33 32 bytes



                      The whole magic happens at the : This operator left multiplies the second argument with the inverse of the first. If there is no inverse, it uses a suitable pseudo inverse. x(:,1) extracts the first column $ Y $ and [x(:,1).^0,...] adds a column of ones to the remaining part $ X $ such that we also get the intercept beta0.



                      Thanks @LuisMendo for -1 byte!





                      @(x)[(t=x(:,1)).^0,x(:,2:end)]t


                      Try it online!






                      share|improve this answer






















                      • That .^0 is a very clever way to get ones
                        – Luis Mendo
                        Sep 23 at 2:45











                      • @LuisMendo My head was still in matlab mode, thanks a lot :D
                        – flawr
                        Sep 23 at 9:09












                      up vote
                      3
                      down vote










                      up vote
                      3
                      down vote










                      Octave, 41 33 32 bytes



                      The whole magic happens at the : This operator left multiplies the second argument with the inverse of the first. If there is no inverse, it uses a suitable pseudo inverse. x(:,1) extracts the first column $ Y $ and [x(:,1).^0,...] adds a column of ones to the remaining part $ X $ such that we also get the intercept beta0.



                      Thanks @LuisMendo for -1 byte!





                      @(x)[(t=x(:,1)).^0,x(:,2:end)]t


                      Try it online!






                      share|improve this answer















                      Octave, 41 33 32 bytes



                      The whole magic happens at the : This operator left multiplies the second argument with the inverse of the first. If there is no inverse, it uses a suitable pseudo inverse. x(:,1) extracts the first column $ Y $ and [x(:,1).^0,...] adds a column of ones to the remaining part $ X $ such that we also get the intercept beta0.



                      Thanks @LuisMendo for -1 byte!





                      @(x)[(t=x(:,1)).^0,x(:,2:end)]t


                      Try it online!







                      share|improve this answer














                      share|improve this answer



                      share|improve this answer








                      edited Sep 23 at 8:58

























                      answered Sep 22 at 13:46









                      flawr

                      25.7k562177




                      25.7k562177











                      • That .^0 is a very clever way to get ones
                        – Luis Mendo
                        Sep 23 at 2:45











                      • @LuisMendo My head was still in matlab mode, thanks a lot :D
                        – flawr
                        Sep 23 at 9:09
















                      • That .^0 is a very clever way to get ones
                        – Luis Mendo
                        Sep 23 at 2:45











                      • @LuisMendo My head was still in matlab mode, thanks a lot :D
                        – flawr
                        Sep 23 at 9:09















                      That .^0 is a very clever way to get ones
                      – Luis Mendo
                      Sep 23 at 2:45





                      That .^0 is a very clever way to get ones
                      – Luis Mendo
                      Sep 23 at 2:45













                      @LuisMendo My head was still in matlab mode, thanks a lot :D
                      – flawr
                      Sep 23 at 9:09




                      @LuisMendo My head was still in matlab mode, thanks a lot :D
                      – flawr
                      Sep 23 at 9:09

















                       

                      draft saved


                      draft discarded















































                       


                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodegolf.stackexchange.com%2fquestions%2f172643%2fordinary-least-squares-regression%23new-answer', 'question_page');

                      );

                      Post as a guest













































































                      Popular posts from this blog

                      How to check contact read email or not when send email to Individual?

                      Displaying single band from multi-band raster using QGIS

                      How many registers does an x86_64 CPU actually have?