Big data: what is the right filesystem, ext4 or XFS?

We have Red Hat Linux version 7.2, with the XFS file system.
From /etc/fstab:
/dev/mapper/vgCLU_HDP-root / xfs defaults 0 0
UUID=7de1ab5c-b605-4b6f-bdf1-f1e8658fb9 /boot xfs defaults 0 0
/dev/mapper/vg
/dev/mapper/vgCLU_HDP-root / xfs defaults 0 0
UUID=7de1dc5c-b605-4a6f-bdf1-f1e869f6ffb9 /boot xfs defaults 0 0
/dev/mapper/vgCLU_HDP-var /var xfs defaults 0 0
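For readers who want to confirm which filesystem each mount point uses, the type is the third whitespace-separated field of an fstab entry. The following is a minimal Python sketch, not part of the original question; the `FstabEntry` and `parse_fstab` names are hypothetical, and the sample text reuses only the complete entries quoted above.

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: str
    dump: int
    fsck_pass: int


def parse_fstab(text: str) -> List[FstabEntry]:
    """Parse fstab-style text, skipping blanks, comments and truncated lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # too few fields to be a usable entry
        # The last two fields are optional in fstab and default to 0.
        dump = int(fields[4]) if len(fields) > 4 else 0
        fsck_pass = int(fields[5]) if len(fields) > 5 else 0
        entries.append(
            FstabEntry(fields[0], fields[1], fields[2], fields[3], dump, fsck_pass)
        )
    return entries


# Sample taken from the complete entries in the question.
SAMPLE = """\
/dev/mapper/vgCLU_HDP-root / xfs defaults 0 0
UUID=7de1dc5c-b605-4a6f-bdf1-f1e869f6ffb9 /boot xfs defaults 0 0
/dev/mapper/vgCLU_HDP-var /var xfs defaults 0 0
"""

if __name__ == "__main__":
    for entry in parse_fstab(SAMPLE):
        print(f"{entry.mount_point}: {entry.fs_type}")
```

On a real host the same function could be fed the contents of /etc/fstab instead of the sample string.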
The machines are used for Hadoop clusters.
I am wondering what the best file system is for this purpose.
So which is better, ext4 or XFS, given that the machines are used for a Hadoop cluster?
linux filesystems rhel ext4 xfs
edited Apr 30 at 4:06
A. Rawson
asked Apr 29 at 15:20
yael
Both work fine. I'm running a cluster using both (mostly because the provision scripts create XFS and we forget to reformat the disks before installing Hadoop)
– cricket_007
Apr 30 at 4:11
I assume that you mean Red Hat Enterprise Linux 7.2, not Red Hat Linux 7.2, right?
– mattdm
Apr 30 at 5:22
Yes, we have Red Hat version 7.2.
– yael
Apr 30 at 10:04
2 Answers
accepted
This is addressed in this knowledge base article; the main consideration for you will be the support levels available: Ext4 is supported up to 50TB, XFS up to 500TB. For really big data, you'd probably end up looking at shared storage, which by default means GFS2 on RHEL 7, except that for Hadoop you'd use HDFS or GlusterFS.
For local storage on RHEL the default is XFS and you should generally use that unless you have specific reasons not to.
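As an illustration of the sizing consideration, here is a minimal Python sketch that checks a planned filesystem size against the support limits quoted in this answer. The `SUPPORTED_MAX_TB` table and the `candidates_for` function are hypothetical names, and the figures are copied from the answer rather than from vendor documentation, so verify them for your exact release.

```python
# Support limits in terabytes, as quoted in the answer above (an assumption
# for any other release; check the vendor documentation).
SUPPORTED_MAX_TB = {"ext4": 50, "xfs": 500}


def candidates_for(size_tb: float) -> list:
    """Return the filesystems whose quoted support limit covers size_tb."""
    return sorted(fs for fs, limit in SUPPORTED_MAX_TB.items() if size_tb <= limit)


if __name__ == "__main__":
    for planned_tb in (40, 120, 600):
        print(planned_tb, "TB ->", candidates_for(planned_tb) or "none within limits")
```

An empty result corresponds to the case the answer describes, where shared or distributed storage becomes the more likely choice.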
edited Apr 30 at 4:45
answered Apr 29 at 15:47
Stephen Kitt
XFS is an amazing filesystem, especially for large files. If your load involves lots of small files, cleaning up any fragmentation periodically may improve performance. I don't worry about it and use XFS for all loads. It is well supported, so there is no reason not to use it.
Set aside a machine and disk for your own testing of various filesystems if you want to find out what is best for your typical workload. Working the test load in steps over the entire disk can tell you something about how the filesystem being tested behaves.
Testing your load on your machine is the only way to be sure.
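One way to start such a test is a small timing harness. The sketch below is only an assumption about what a first test might look like, not a full benchmark: it writes the same number of bytes once as a single large file and once as many small files, calling fsync on each file, and reports elapsed seconds. The `time_writes` and `compare_workloads` names are hypothetical, and `directory` should point at a mount on the filesystem under test.

```python
import os
import tempfile
import time


def time_writes(directory, file_size, file_count, block_size=1024 * 1024):
    """Write file_count files of file_size bytes, fsync each, return seconds."""
    block = os.urandom(min(block_size, file_size))
    paths = []
    start = time.perf_counter()
    for i in range(file_count):
        path = os.path.join(directory, f"bench_{i}.dat")
        paths.append(path)
        remaining = file_size
        with open(path, "wb") as fh:
            while remaining > 0:
                chunk = block[:remaining]  # last chunk may be shorter
                fh.write(chunk)
                remaining -= len(chunk)
            fh.flush()
            os.fsync(fh.fileno())  # push data to disk so the cache does not hide the cost
    elapsed = time.perf_counter() - start
    for path in paths:
        os.remove(path)  # clean up outside the timed region
    return elapsed


def compare_workloads(directory, total_bytes=64 * 1024 * 1024, small_file_size=4096):
    """Time the same byte count as one large file and as many small files."""
    return {
        "few large files": time_writes(directory, total_bytes, 1),
        "many small files": time_writes(
            directory, small_file_size, total_bytes // small_file_size
        ),
    }


if __name__ == "__main__":
    # The temporary directory lives on whatever filesystem backs the default
    # temp location; replace it with a path on the disk you want to test.
    with tempfile.TemporaryDirectory() as scratch:
        for name, seconds in compare_workloads(scratch, total_bytes=1024 * 1024).items():
            print(f"{name}: {seconds:.3f} s")
```

Running it against identically sized ext4 and XFS volumes on the same hardware gives a rough comparison; a realistic test would replay the actual Hadoop workload instead.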
answered Apr 29 at 17:09
casualunixer